Abstract

Let $P$ be a set of $n$ points in $\mathbb{R}^d$. We present a linear-size data structure for answering
range queries on $P$ with constant-complexity semialgebraic sets as ranges, in time close to
$O(n^{1-1/d})$. It essentially matches the performance of similar structures for simplex range
searching, and, for $d \ge 5$, significantly improves earlier solutions by the first two authors
obtained in 1994. This almost settles a long-standing open problem in range searching.

The data structure is based on the polynomial-partitioning technique of Guth and Katz
[arXiv:1011.4105], which shows that for a parameter $r$, $1 < r \le n$, there exists a $d$-variate
polynomial $f$ of degree $O(r^{1/d})$ such that each connected component of $\mathbb{R}^d \setminus Z(f)$ contains
at most $n/r$ points of $P$, where $Z(f)$ is the zero set of $f$. We present an efficient randomized
algorithm for computing such a polynomial partition, which is of independent interest and is
likely to have additional applications.

1 Introduction

Range searching. Let $P$ be a set of $n$ points in $\mathbb{R}^d$, where $d$ is a small constant. Let $\Gamma$ be a family
of geometric "regions," called ranges, in $\mathbb{R}^d$, each of which can be described algebraically by some
fixed number of real parameters. For example, $\Gamma$ can be the set of all axis-parallel boxes, balls,
simplices, or cylinders, or the set of all intersections of pairs of ellipsoids. The $\Gamma$-range searching
problem can be defined as: Preprocess $P$ into a data structure so that the number of points of $P$
lying in a query range $\gamma \in \Gamma$ can be counted efficiently. Actually, similar to many previous papers,
we consider a more general setting, where one assumes a weight function on the points in $P$ and
asks for the cumulative weight of the points in $P \cap \gamma$. The weights are assumed to belong to
a semigroup, i.e., subtractions are not allowed. We assume that the semigroup operation can be
executed in constant time.

In this paper we consider the case in which $\Gamma$ is a set of constant-complexity semialgebraic sets.
We recall that a semialgebraic set is a subset of $\mathbb{R}^d$ obtained from a finite number of sets of the form
$\{x \in \mathbb{R}^d \mid g(x) \ge 0\}$, where $g$ is a $d$-variate polynomial with integer coefficients,¹ by Boolean
operations (unions, intersections, and complementations). Specifically, we let $\Gamma_{d,\Delta,s}$ denote the
family of all semialgebraic sets in $\mathbb{R}^d$ defined by at most $s$ polynomial inequalities of degree at
most $\Delta$ each. If $d, \Delta, s$ are all regarded as constants, we refer to the sets in $\Gamma_{d,\Delta,s}$ as
constant-complexity semialgebraic sets (such sets are sometimes also called Tarski cells). By semialgebraic
range searching we mean $\Gamma_{d,\Delta,s}$-range searching for some parameters $d, \Delta, s$ (although in most
applications the actual collection $\Gamma$ of ranges is only a restricted subset of such a collection $\Gamma_{d,\Delta,s}$).
Besides being interesting in its own right, semialgebraic range searching also arises in a wide range
of geometric searching problems, such as searching for a point nearest to a query geometric object,
counting the number of input objects intersecting a query object, and many others.
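
To make the problem statement concrete, here is a minimal Python sketch that answers a semigroup range query by brute force, in $O(n)$ time per query. It is only the naive baseline that the data structures of this paper improve upon, not part of the paper's method. The names `ball`, `intersection`, and `range_weight` are hypothetical, and a range is modeled as a predicate built from polynomial sign conditions $g(x) \ge 0$.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

Point = Tuple[float, ...]
Range = Callable[[Point], bool]
W = TypeVar("W")


def ball(center: Point, radius: float) -> Range:
    """Range {x : g(x) >= 0} with g(x) = radius^2 - |x - center|^2 (degree 2)."""
    def g(x: Point) -> float:
        return radius ** 2 - sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return lambda x: g(x) >= 0


def intersection(*ranges: Range) -> Range:
    """Boolean combination (here: intersection) of simpler ranges."""
    return lambda x: all(r(x) for r in ranges)


def range_weight(points: Sequence[Point], weights: Sequence[W],
                 gamma: Range, op: Callable[[W, W], W]) -> Optional[W]:
    """Combine the weights of the points lying in gamma using only the semigroup
    operation `op` (no subtraction). Returns None if no point lies in gamma."""
    total: Optional[W] = None
    for p, w in zip(points, weights):
        if gamma(p):
            total = w if total is None else op(total, w)
    return total


points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
weights = [5, 2, 7]
# Intersection of two disks: a range defined by s = 2 inequalities of degree 2.
lens = intersection(ball((0.0, 0.0), 1.5), ball((2.0, 0.0), 1.5))
print(range_weight(points, weights, ball((0.0, 0.0), 1.5), max))
print(range_weight(points, weights, lens, lambda a, b: a + b))
```

Passing `max` as the operation illustrates why the semigroup model matters: a maximum cannot be undone by subtraction, so the query must be assembled from disjoint pieces only.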

This paper focuses on the low storage version of range searching with constant-complexity 
semialgebraic sets — the data structure is allowed to use only linear or near-linear storage, and the 
goal is to make the query time as small as possible. As is typical in computational geometry, we will 
use the real RAM model of computation, where we can compute exactly with arbitrary real numbers 
and each arithmetic operation is executed in constant time. 

Previous work. Motivated by a wide range of applications, several variants of range searching 
have been studied in computational geometry and database systems for more than three decades. See 
[1, 23] for comprehensive surveys of this topic. The early work focused on the so-called orthogonal
range searching, where the ranges are axis-parallel boxes. After three decades of extensive work 
on this particular case, some basic questions still remain open. However, geometry plays almost no 
role in the known data structures for orthogonal range searching. 

The most basic and most studied truly geometric instance of range searching is with halfspaces,
or more generally simplices, as ranges. Studies in the early 1990s have essentially determined
the optimal trade-off between the worst-case query time and the storage (and preprocessing time)
required by any data structure for simplex range searching.² Lower bounds for this trade-off have
been given by Chazelle [7] under the semigroup model of computation, where subtraction of the
point weights is not allowed. It is possible that, say, the counting version of the simplex range
searching problem, where we ask just for the number of points in the query simplex, might admit
better solutions using subtractions, but no such solutions are known. Moreover, there are recent
lower-bound results when subtractions are also allowed; see [19] and references therein.

¹ The usual definition of a semialgebraic set requires these polynomials to have integer coefficients. However, for our
purposes, since we are going to assume the real RAM model of computation, we can actually allow for arbitrary real
coefficients without affecting the asymptotic overhead.

² This applies when $d$ is considered fixed and the implicit constants in the asymptotic notation may depend on $d$. This
is the setting in all the previous papers, including the present one. Of course, in practical applications, this assumption
may be unrealistic unless the dimension is really small. However, the known lower bounds imply that if the dimension is
large, no efficient solutions to simplex range searching exist, at least in the worst-case setting.

The data structures proposed for simplex range searching over the last two decades [21, 22]
match the known lower bounds within polylogarithmic factors. The state-of-the-art upper bounds
are by (i) Chan [5], who, building on many earlier results, provides a linear-size data structure
with $O(n \log n)$ expected preprocessing time and $O(n^{1-1/d})$ query time, and (ii) Matoušek [22],
who provides a data structure with $O(n^d)$ storage, $O((\log n)^{d+1})$ query time, and $O(n^d (\log n)^{\varepsilon})$
preprocessing time.³ A trade-off between space and query time can be obtained by combining these
two data structures [22].

³ Here and in the sequel, $\varepsilon$ denotes an arbitrarily small positive constant. The implicit constants in the asymptotic
notation may depend on it, generally tending to infinity as $\varepsilon$ decreases to 0.

Yao and Yao [32] were perhaps the first to consider range searching in which ranges were
delimited by graphs of polynomial functions. Agarwal and Matoušek [2] have introduced a
systematic study of semialgebraic range searching. Building on the techniques developed for simplex
range searching, they presented a linear-size data structure with $O(n^{1-1/b+\varepsilon})$ query time, where
$b = \max(d, 2d-4)$. For $d \le 4$, this almost matches the performance for simplex range
searching, but for $d \ge 5$ there is a gap in the exponents of the corresponding bounds. Also see [28] for
related recent developments.

The bottleneck in the performance of the just mentioned range-searching data structure of [2]
is a combinatorial geometry problem, known as the decomposition of arrangements into
constant-complexity cells. Here, we are given a set $\Sigma$ of $t$ algebraic surfaces in $\mathbb{R}^d$ (i.e., zero sets of
$d$-variate polynomials), with degrees bounded by a constant $\Delta_0$, and we want to decompose each
cell of the arrangement $\mathcal{A}(\Sigma)$ (see Section 4 for details) into subcells that are constant-complexity
semialgebraic sets, i.e., belong to $\Gamma_{d,\Delta,s}$ for some constants $\Delta$ (bound on degrees) and $s$ (number
of defining polynomials), which may depend on $d$ and $\Delta_0$, but not on $t$. The crucial quantity is the
total number of the resulting subcells over all cells of $\mathcal{A}(\Sigma)$; namely, if one can construct such a
decomposition with $O(t^b)$ subcells, with some constant $b$, for every $t$ and $\Sigma$, then the method of [2]
yields query time $O(n^{1-1/b+\varepsilon})$ (with linear storage). The only known general-purpose technique for
producing such a decomposition is the so-called vertical decomposition [8, 27], which decomposes
$\mathcal{A}(\Sigma)$ into roughly $t^{2d-4}$ Tarski cells, for $d \ge 4$ [18, 27].

An alternative approach, based on linearization, was also proposed in [2]. It maps the
semialgebraic ranges in $\mathbb{R}^d$ to simplices in some higher-dimensional space and uses simplex range searching
there. However, its performance depends on the specific form of the polynomials defining the
ranges. In some special cases (e.g., when ranges are balls in $\mathbb{R}^d$), linearization yields better query
time than the decomposition-based technique mentioned above, but for general constant-complexity
semialgebraic ranges, linearization has worse performance.

Our results. In a recent breakthrough, Guth and Katz [12] have presented a new space
decomposition technique, called polynomial partitioning. For a set $P \subset \mathbb{R}^d$ of $n$ points and a real
parameter $r$, $1 < r \le n$, an $r$-partitioning polynomial for $P$ is a nonzero $d$-variate polynomial
$f$ such that each connected component of $\mathbb{R}^d \setminus Z(f)$ contains at most $n/r$ points of $P$, where
$Z(f) := \{x \in \mathbb{R}^d \mid f(x) = 0\}$ denotes the zero set of $f$. The decomposition of $\mathbb{R}^d$ into $Z(f)$ and
the connected components of $\mathbb{R}^d \setminus Z(f)$ is called a polynomial partition (induced by $f$). Guth and
Katz show that an $r$-partitioning polynomial of degree $O(r^{1/d})$ always exists, but their argument
does not lead to an efficient algorithm for constructing such a polynomial, mainly because it relies
on ham-sandwich cuts in high-dimensional spaces, for which no efficient construction is known.
Our first result is an efficient randomized algorithm for computing an $r$-partitioning polynomial.

Theorem 1.1. Given a set $P$ of $n$ points in $\mathbb{R}^d$, for some fixed $d$, and a parameter $r \le n$, an
$r$-partitioning polynomial for $P$ of degree $O(r^{1/d})$ can be computed in randomized expected time
$O(nr + r^3)$.

Next, we use this algorithm to bypass the arrangement-decomposition problem mentioned above.
Namely, based on polynomial partitions, we construct partition trees [1, 23] that answer range
queries with constant-complexity semialgebraic sets in near-optimal time, using linear storage. An
essential ingredient in the performance analysis of these partition trees is a recent combinatorial
result of Barone and Basu [3], originally conjectured by the second author, which deals with the
complexity of certain kinds of arrangements of zero sets of polynomials (see Theorem 4.2). While
there have already been several combinatorial applications of the Guth-Katz technique (the most
impressive being the original one in [12], which solves the famous Erdős distinct distances problem,
and others presented in [14, 13, 29, 34]), ours seems to be the first algorithmic application.



We establish two range-searching results, both based on polynomial partitions. For the first
result, we need to introduce the notion of $D$-general position, for an integer $D \ge 1$. We say that a
set $P \subset \mathbb{R}^d$ is in $D$-general position if no $k$ points of $P$ are contained in the zero set of a nonzero
$d$-variate polynomial of degree at most $D$, where $k := \binom{D+d}{d}$. This is the number one expects for a
"generic" point set. Indeed, $d$-variate polynomials of degree at most $D$ have at most $k - 1$ distinct
nonconstant monomials, from which it follows that any set of $k - 1$ points in $\mathbb{R}^d$ is contained in the
zero set of a $d$-variate polynomial of degree at most $D$; e.g., see [10, 11].
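
The threshold $k = \binom{D+d}{d}$ is simply the number of monomials of degree at most $D$ in $d$ variables, the constant monomial included. The short Python check below (function names are hypothetical) compares the binomial formula against a direct enumeration of exponent vectors.

```python
from itertools import product
from math import comb


def general_position_threshold(d: int, D: int) -> int:
    """k = C(D + d, d): the number of monomials of degree <= D in d variables,
    including the constant monomial."""
    return comb(D + d, d)


def count_monomials_brute_force(d: int, D: int) -> int:
    """Count exponent vectors (e_1, ..., e_d) with e_1 + ... + e_d <= D directly."""
    return sum(1 for e in product(range(D + 1), repeat=d) if sum(e) <= D)


for d, D in [(2, 1), (2, 3), (3, 2)]:
    k = general_position_threshold(d, D)
    assert k == count_monomials_brute_force(d, D)
    print(f"d={d}, D={D}: k={k}, nonconstant monomials={k - 1}")
```

For $d = 2$ and $D = 1$ this gives $k = 3$, which is the familiar requirement that no three points be collinear.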



Theorem 1.2. Let $d$, $\Delta$, $s$ and $\varepsilon > 0$ be constants. Let $P \subset \mathbb{R}^d$ be an $n$-point set in $D_0$-general
position, where $D_0$ is a suitable constant depending on $d$, $\Delta$, and $\varepsilon$. Then the $\Gamma_{d,\Delta,s}$-range
searching problem for $P$ can be solved with $O(n)$ storage, $O(n \log n)$ expected preprocessing time, and
$O(n^{1-1/d+\varepsilon})$ query time.

We note that both here and in the next theorem, while the preprocessing algorithm is random- 
ized, the queries are answered deterministically, and the query time is worst-case. 

Of course, we would like to handle arbitrary point sets, not only those in $D_0$-general position.
This can be achieved by an infinitesimal perturbation of the points of $P$. A general technique known
as "simulation of simplicity" (in the version considered by Yap [33]) ensures that the perturbed set
$P'$ is in $D_0$-general position. If a point $p \in P$ lies in the interior of a query range $\gamma$, then so does
the corresponding perturbed point $p' \in P'$, and similarly for $p$ in the interior of $\mathbb{R}^d \setminus \gamma$. However,
for $p$ on the boundary of $\gamma$, we cannot be sure if $p'$ ends up inside or outside $\gamma$.

Let us say that a boundary-fuzzy solution to the $\Gamma_{d,\Delta,s}$-range searching problem is a data
structure that, given a query $\gamma \in \Gamma_{d,\Delta,s}$, returns an answer in which all points of $P$ in the interior of $\gamma$ are
counted and none in the interior of $\mathbb{R}^d \setminus \gamma$ is counted, while each point $p \in P$ on the boundary of $\gamma$
may or may not be counted. In some applications, we can think of the points of $P$ being imprecise
anyway (e.g., their coordinates come from some imprecise measurement), and then boundary-fuzzy
range searching may be adequate.

Corollary 1.3. Let $d$, $\Delta$, $s$, and $\varepsilon > 0$ be constants. Then for every $n$-point set in $\mathbb{R}^d$, there is
a boundary-fuzzy $\Gamma_{d,\Delta,s}$-range searching data structure with $O(n)$ storage, $O(n \log n)$ expected
preprocessing time, and $O(n^{1-1/d+\varepsilon})$ query time.

Actually, previous results on range searching that use simulation of simplicity to avoid degen- 
erate cases also solve only the boundary-fuzzy variant. However, the previous techniques, even if 
presented only for point sets in general position, can usually be adapted to handle degenerate cases 
as well, perhaps with some effort, which is nevertheless routine. For our technique, degeneracy 
appears to be a more substantial problem because it is possible that a large subset of $P$ (maybe even
all of $P$) is contained in the zero set of the partitioning polynomial $f$, and the recursive
divide-and-conquer mechanism yielded by the partition of $f$ does not apply to this subset.

Partially in response to this issue, we also present a different data structure that, at a somewhat 
higher preprocessing cost, not only gets rid of the boundary-fuzziness condition but also has a 
slightly improved query time (in terms of n). The main idea is that we build an auxiliary recursive 
data structure to handle the potentially large subset of points that lie in the zero set of the partitioning 
polynomial. 

Theorem 1.4. Let $d$, $\Delta$, $s$, and $\varepsilon > 0$ be constants. Then the $\Gamma_{d,\Delta,s}$-range searching problem for an
arbitrary $n$-point set in $\mathbb{R}^d$ can be solved with $O(n)$ storage, $O(n^{1+\varepsilon})$ expected preprocessing time,
and $O(n^{1-1/d} \log^B n)$ query time, where $B$ is a constant depending on $d$, $\Delta$, $s$ and $\varepsilon$.

We remark that the dependence of $B$ on $\Delta$, $s$, and $\varepsilon$ is reasonable, but its dependence on $d$ is
superexponential.



Roadmap of the paper. Our algorithm is based on the polynomial partitioning technique by Guth
and Katz, and we begin by briefly reviewing it in Section 2. Next, in Section 3, we describe the
randomized algorithm for constructing such a partitioning polynomial. Section 4 presents an algorithm
for computing the cells of a polynomial partition that are crossed by a semialgebraic range, and
discusses several related topics. Section 5 presents our first data structure, stated in Theorem 1.2.
Section 6 describes the method for handling points lying on the zero set of the partitioning
polynomial, and Section 7 presents our second data structure. We conclude in Section 8 by mentioning a
few open problems.



2 Polynomial Partitions 

In this section we briefly review the Guth-Katz technique for later use. We begin by stating their 
result. 

Theorem 2.1 (Guth-Katz [12]). Given a set $P$ of $n$ points in $\mathbb{R}^d$ and a parameter $r \le n$, there exists
an $r$-partitioning polynomial for $P$ of degree at most $O(r^{1/d})$ (for $d$ fixed).

The degree in the theorem is asymptotically optimal in the worst case because the number of
connected components of $\mathbb{R}^d \setminus Z(f)$ is $O((\deg f)^d)$ for every polynomial $f$ (see, e.g., Warren [31,
Theorem 2]).



Sketch of proof. The Guth-Katz proof uses the polynomial ham-sandwich theorem of Stone and
Tukey [30], which we state here in a version for finite point sets: If $A_1, \ldots, A_k$ are finite sets in $\mathbb{R}^d$
and $D$ is an integer satisfying $\binom{D+d}{d} - 1 \ge k$, then there exists a nonzero polynomial $f$ of degree at
most $D$ that simultaneously bisects all the sets $A_i$. Here "$f$ bisects $A_i$" means that $f > 0$ in at most
$\lfloor |A_i|/2 \rfloor$ points of $A_i$ and $f < 0$ in at most $\lfloor |A_i|/2 \rfloor$ points of $A_i$; $f$ might vanish at any number of
the points of $A_i$, possibly even at all of them.

Guth and Katz inductively construct collections $\mathcal{P}_0, \mathcal{P}_1, \ldots, \mathcal{P}_m$ of subsets of $P$. For
$j = 0, 1, \ldots, m$, $\mathcal{P}_j$ consists of at most $2^j$ pairwise-disjoint subsets of $P$, each of size at most $n/2^j$;
the union $\bigcup \mathcal{P}_j$ does not have to contain all points of $P$.

Initially, we have $\mathcal{P}_0 = \{P\}$. The algorithm stops as soon as each subset in $\mathcal{P}_m$ has at most
$n/r$ points. This implies that $m \le \lceil \log_2 r \rceil$. Having constructed $\mathcal{P}_{j-1}$, we use the polynomial
ham-sandwich theorem to construct a polynomial $f_j$ that bisects each set of $\mathcal{P}_{j-1}$, with
$\deg f_j = O(2^{j/d})$ (this is indeed an asymptotic upper bound for the smallest $D$ satisfying $\binom{D+d}{d} - 1 \ge 2^{j-1}$,
assuming $d$ to be a constant). For every subset $Q \in \mathcal{P}_{j-1}$, let $Q^+ = \{q \in Q \mid f_j(q) > 0\}$ and
$Q^- = \{q \in Q \mid f_j(q) < 0\}$. We set $\mathcal{P}_j := \{Q^+, Q^- \mid Q \in \mathcal{P}_{j-1}\}$; empty subsets are not included
in $\mathcal{P}_j$.

The desired $r$-partitioning polynomial for $P$ is then the product $f := f_1 f_2 \cdots f_m$. We have

$$\deg f = \sum_{j=1}^{m} \deg f_j = \sum_{j=1}^{m} O(2^{j/d}) = O(r^{1/d}).$$

By construction, the points of $P$ lying in a single connected component of $\mathbb{R}^d \setminus Z(f)$ belong to a
single member of $\mathcal{P}_m$, which implies that each connected component contains at most $n/r$ points
of $P$. □
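
The degree bound can be explored numerically. The following Python sketch (hypothetical helper names; it only tabulates the degrees required by the ham-sandwich condition and does not construct any polynomial) computes the smallest $D$ with $\binom{D+d}{d} - 1 \ge 2^{j-1}$ for each step $j$ and sums these degrees, printing the ratio to $r^{1/d}$ for inspection.

```python
from math import comb


def smallest_degree(d: int, k: int) -> int:
    """Smallest integer D with C(D + d, d) - 1 >= k."""
    D = 0
    while comb(D + d, d) - 1 < k:
        D += 1
    return D


def guth_katz_degree(d: int, r: int) -> int:
    """Sum of the degrees D_j used in steps j = 1, ..., m = ceil(log2 r),
    where step j has to bisect at most 2^(j-1) sets."""
    m = (r - 1).bit_length()  # equals ceil(log2 r) for integers r >= 1
    return sum(smallest_degree(d, 2 ** (j - 1)) for j in range(1, m + 1))


for d in (2, 3):
    for r in (16, 256, 4096):
        deg = guth_katz_degree(d, r)
        print(f"d={d}, r={r}: total degree {deg}, ratio to r^(1/d) = {deg / r ** (1 / d):.2f}")
```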

Sketch of proof of the Stone-Tukey polynomial ham-sandwich theorem. We begin by observing
that $\binom{D+d}{d} - 1$ is the number of all nonconstant monomials of degree at most $D$ in $d$ variables.
Thus, we fix a collection $M$ of $k \le \binom{D+d}{d} - 1$ such monomials. Let $\Phi : \mathbb{R}^d \to \mathbb{R}^k$ be the
corresponding Veronese map, which maps a point $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$ to the $k$-tuple of the values at
$(x_1, \ldots, x_d)$ of the monomials from $M$. For example, for $d = 2$, $D = 3$, and $k = 8 \le \binom{3+2}{2} - 1$,
we may use $\Phi(x_1, x_2) = (x_1, x_2, x_1^2, x_1 x_2, x_2^2, x_1^3, x_1^2 x_2, x_1 x_2^2) \in \mathbb{R}^8$, where $M$ is the set of the
eight monomials appearing as components of $\Phi$.

Let $B_i := \Phi(A_i) \subset \mathbb{R}^k$ be the image of the given $A_i$ under this Veronese map, for $i = 1, \ldots, k$.
By the standard ham-sandwich theorem (see, e.g., [24]), there exists a hyperplane $h$ in $\mathbb{R}^k$ that
simultaneously bisects all the $B_i$'s, in the sense that each open halfspace bounded by $h$ contains
at most half of the points of each of the sets $B_i$. In a more algebraic language, there is a nonzero
$k$-variate linear polynomial, which we also call $h$, that bisects all the $B_i$'s, in the sense of being
positive on at most half of the points of each $B_i$, and being negative on at most half of the points
of each $B_i$. Then $f := h \circ \Phi$ is the desired $d$-variate polynomial of degree at most $D$ bisecting all
the $A_i$'s. □
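
The Veronese map and the pull-back $f = h \circ \Phi$ can be sketched in a few lines of Python. The function names are hypothetical, and the ordering of the monomials (by total degree, then by decreasing exponent of $x_1$) is an assumption chosen so that the first eight monomials agree with the example for $d = 2$, $D = 3$.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Exponent = Tuple[int, ...]


def nonconstant_monomials(d: int, D: int) -> List[Exponent]:
    """Exponent vectors of all nonconstant monomials of degree <= D in d variables,
    sorted by total degree."""
    exps = [e for e in product(range(D + 1), repeat=d) if 0 < sum(e) <= D]
    return sorted(exps, key=lambda e: (sum(e), tuple(-ei for ei in e)))


def veronese(x: Sequence[float], monomials: Sequence[Exponent]) -> Tuple[float, ...]:
    """Phi(x): the tuple of values of the chosen monomials at x."""
    values = []
    for e in monomials:
        v = 1
        for xi, ei in zip(x, e):
            v *= xi ** ei
        values.append(v)
    return tuple(values)


def pull_back(h0: float, h: Sequence[float],
              monomials: Sequence[Exponent]) -> Callable[[Sequence[float]], float]:
    """f = h o Phi for the affine function h(y) = h0 + sum_i h[i] * y[i]."""
    def f(x: Sequence[float]) -> float:
        return h0 + sum(c * y for c, y in zip(h, veronese(x, monomials)))
    return f


M = nonconstant_monomials(2, 3)[:8]   # k = 8 of the 9 available monomials
print(M)
print(veronese((2, 3), M))
circle = pull_back(-1, [1, 1], [(2, 0), (0, 2)])  # the polynomial x1^2 + x2^2 - 1
print(circle((1, 0)), circle((2, 0)))
```

A hyperplane in the lifted space thus corresponds to a polynomial in the original space whose coefficients are exactly the hyperplane's coefficients, which is why no extra computation is needed to pass from $h$ to $f$.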



3 Constructing a Partitioning Polynomial 

In this section we present an efficient randomized algorithm that, given a point set $P$ and a
parameter $r \le n$, constructs an $r$-partitioning polynomial. The main difficulty in converting the above
proof of the Guth-Katz partitioning theorem into an efficient algorithm is the use of the (standard)
ham-sandwich theorem in a possibly high-dimensional space. A straightforward algorithm for
computing ham-sandwich cuts in $\mathbb{R}^k$ inspects all possible ways of splitting the input point sets by
a hyperplane, and has running time about $n^k$. Compared to this easy upper bound, the best known
ham-sandwich algorithms can save a factor of about $n$ [20], but this is insignificant in higher
dimensions. A recent result of Knauer, Tiwari, and Werner [17] shows that a certain incremental variant
of computing a ham-sandwich cut is W[1]-hard (where the parameter is the dimension), and thus
one perhaps should not expect much better exact algorithms.

We observe that the exact bisection of each $A_i$ is not needed in the Guth-Katz construction; it
is sufficient to replace the Stone-Tukey polynomial ham-sandwich theorem by a weaker result, as
described below.

Constructing a well-dissecting polynomial. We say that a polynomial $f$ is well-dissecting for a
point set $A$ if $f > 0$ on at most $\frac{7}{8}|A|$ points of $A$ and $f < 0$ on at most $\frac{7}{8}|A|$ points of $A$. Given point
sets $A_1, \ldots, A_k$ in $\mathbb{R}^d$ with $n$ points in total, we present a Las Vegas algorithm for constructing a
polynomial $f$ of degree $O(k^{1/d})$ that is well-dissecting for at least $\lceil k/2 \rceil$ of the $A_i$'s.
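
The well-dissecting condition is easy to test once the values of $f$ on a set are known. Below is a minimal Python sketch (the function name is hypothetical) that uses the threshold $7/8$ appearing throughout this section and compares integers only, so no rounding is involved.

```python
from typing import Iterable


def is_well_dissecting(values: Iterable[float]) -> bool:
    """values: f(a) for every point a of A. Returns True if f > 0 on at most
    (7/8)|A| points and f < 0 on at most (7/8)|A| points. Points where f
    vanishes count toward neither side."""
    values = list(values)
    positive = sum(1 for v in values if v > 0)
    negative = sum(1 for v in values if v < 0)
    return 8 * positive <= 7 * len(values) and 8 * negative <= 7 * len(values)


print(is_well_dissecting([1, 1, 1, 1, 1, 1, 1, -1]))  # 7 of 8 points positive
print(is_well_dissecting([1] * 8))                    # all 8 points positive
```

Note that a polynomial vanishing on all of $A$ is well-dissecting for $A$ under this definition, which mirrors the remark that a bisecting polynomial may vanish on any number of points.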

As in the above proof of the Stone-Tukey polynomial ham-sandwich theorem, let $D$ be the
smallest integer satisfying $\binom{D+d}{d} - 1 \ge k$. We fix a collection $M$ of $k$ distinct nonconstant
monomials of degree at most $D$, and let $\Phi$ be the corresponding Veronese map. For each $i = 1, 2, \ldots, k$,
we pick a point $a_i \in A_i$ uniformly at random and compute $b_i := \Phi(a_i)$. Let $h$ be a hyperplane in $\mathbb{R}^k$
passing through $b_1, \ldots, b_k$, which can be found by solving a system of linear equations, in $O(k^3)$
time.

If the points $b_1, \ldots, b_k$ are not affinely independent, then $h$ is not determined uniquely (this
is a technical nuisance, which the reader may want to ignore on first reading). In order to handle
this case, we prepare in advance, before picking the $a_i$'s, auxiliary affinely independent points
$q_1, \ldots, q_k$ in $\mathbb{R}^k$, which are in general position with respect to $\Phi(A_1), \ldots, \Phi(A_k)$; here we mean
the "ordinary" general position, i.e., no unnecessary affine dependences that involve some of the
$q_i$'s and the other points arise. The points $q_i$ can be chosen at random, say, uniformly in the unit
cube; with high probability, they have the desired general position property. (If we do not want
to assume the capability of choosing a random real number, we can pick the $q_i$'s uniformly at
random from a sufficiently large discrete set.) If the dimension of the affine hull of $b_1, \ldots, b_k$ is
$k' < k - 1$, we choose the hyperplane $h$ through $b_1, \ldots, b_k$ and $q_1, \ldots, q_{k-k'-1}$. If $h$ is not unique,
i.e., $q_1, \ldots, q_{k-k'-1}$ are not affinely independent with respect to $b_1, \ldots, b_k$, which we can detect while
solving the linear system, we restart the algorithm by choosing $q_1, \ldots, q_k$ anew and then picking
new $a_1, \ldots, a_k$. In this way, after a constant expected number of iterations, we obtain the uniquely
determined hyperplane $h$ through $b_1, \ldots, b_k$ and $q_1, \ldots, q_{k-k'-1}$ as above, and we let $f = h \circ \Phi$
denote the resulting $d$-variate polynomial. We refer to these steps as one trial of the algorithm. For
each $A_i$, we check whether $f$ is well-dissecting for $A_i$. If $f$ is well-dissecting for fewer than
$k/2$ sets, then we discard $f$ and perform another trial.
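
The linear-algebra core of one trial can be sketched as follows. This is a simplified illustration under stated assumptions, not the full procedure: all names are hypothetical, exact rational arithmetic (`fractions.Fraction`) stands in for the real RAM, and the degenerate case is only detected (the function returns `None`) rather than resolved with the auxiliary points $q_i$.

```python
import random
from fractions import Fraction
from typing import List, Optional, Sequence, Tuple

Exponent = Tuple[int, ...]


def hyperplane_through(points: Sequence[Sequence[Fraction]]) -> Optional[List[Fraction]]:
    """Coefficients (c_0, ..., c_k), not all zero, with c_0 + sum_i c_i * y_i = 0 at
    each of the k given points of R^k, or None if the hyperplane is not unique."""
    k = len(points)
    rows = [[Fraction(1)] + [Fraction(v) for v in p] for p in points]
    pivots: List[int] = []
    for col in range(k + 1):          # Gauss-Jordan elimination on a k x (k+1) matrix
        r = len(pivots)
        if r == k:
            break
        pivot = next((i for i in range(r, k) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [v / lead for v in rows[r]]
        for i in range(k):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
    if len(pivots) < k:
        return None                   # affinely dependent points (degenerate case)
    free = next(c for c in range(k + 1) if c not in pivots)
    coeffs = [Fraction(0)] * (k + 1)
    coeffs[free] = Fraction(1)
    for row, col in zip(rows, pivots):
        coeffs[col] = -row[free]
    return coeffs


def monomial_value(x: Sequence[int], e: Exponent) -> Fraction:
    v = Fraction(1)
    for xi, ei in zip(x, e):
        v *= Fraction(xi) ** ei
    return v


def evaluate(coeffs: Sequence[Fraction], monomials: Sequence[Exponent], x) -> Fraction:
    """f(x) for f = h o Phi, with coeffs[0] the constant term."""
    return coeffs[0] + sum(c * monomial_value(x, e) for c, e in zip(coeffs[1:], monomials))


def one_trial(sets, monomials, rng: random.Random):
    """Pick one random point per set, lift it by the Veronese map, and fit h."""
    picks = [rng.choice(A) for A in sets]
    lifted = [[monomial_value(a, e) for e in monomials] for a in picks]
    return picks, hyperplane_through(lifted)


rng = random.Random(0)
sets = [[(0, 0), (1, 2), (2, 1)], [(3, 3), (4, 1), (5, 5)]]  # k = 2 sets in the plane
monomials = [(1, 0), (0, 1)]                                # D = 1 is enough for k = 2
picks, coeffs = one_trial(sets, monomials, rng)
print(picks, coeffs)
```

By construction the resulting polynomial vanishes at every picked point; a complete trial would then evaluate it on all of the $A_i$ and count how many sets are well-dissected.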

We now analyze the expected running time of the algorithm. The intuition is that $f$ is expected
to well-dissect a significant fraction, say, at least half, of the sets $A_i$. This intuition is reflected in
the next lemma. Let $X_i$ be the indicator variable of the event: $A_i$ is not well-dissected by $f$.

Lemma 3.1. For every $i = 1, 2, \ldots, k$, $E[X_i] \le 1/4$.

Proof. Let us fix $i$ and the choices of $a_j$ (and thus of $b_j = \Phi(a_j)$) for all $j \ne i$. Let $k_0$ be the
dimension of $F_0$, the affine hull of $\{b_j \mid j \ne i\}$. Then the resulting hyperplane $h$ passes through the
$(k-2)$-flat $F$ spanned by $F_0$ and $q_1, \ldots, q_{k-k_0-2}$, irrespective of which point of $A_i$ is chosen. If $a_i$,
the point chosen from $A_i$, is such that $b_i = \Phi(a_i)$ lies on $F_0$, then $h$ also passes through $q_{k-k_0-1}$.

Put $B_i := \Phi(A_i)$, and let us project the configuration orthogonally to a 2-dimensional plane
$\pi$ orthogonal to $F$. Then $F$ appears as a point $F^* \in \pi$, and $B_i$ projects to a (multi)set $B_i^*$ in $\pi$.
The random hyperplane $h$ projects to a random line $h^*$ in $\pi$, whose choice can be interpreted as
follows: pick $b_i^* \in B_i^*$ uniformly at random; if $b_i^* \ne F^*$, then $h^*$ is the unique line through $b_i^*$
and $F^*$; otherwise, when $b_i^* = F^*$, $h^*$ is the unique line through $F^*$ and $q^*_{k-k_0-1}$; by construction,
$q^*_{k-k_0-1} \ne F^*$. The indicator variable $X_i$ is 1 if and only if the resulting $h^*$ has more than $\frac{7}{8}|B_i^*|$
points of $B_i^*$ (strictly) on one side.

The special role of $q^*_{k-k_0-1}$ can be eliminated if we first move the points of $B_i^*$ coinciding with
$F^*$ to the point $q^*_{k-k_0-1}$, and then slightly perturb the points so as to ensure that all points of $B_i^*$ are
distinct and lie at distinct orientations from $F^*$; it is easy to see that these transformations cannot
decrease the probability of $X_i = 1$. Finally, we note that the side of $h^*$ containing a point $b^* \in B_i^*$
only depends on the orientation of the vector $\overrightarrow{F^* b^*}$, so we can also assume the points of $B_i^*$ to lie
on the unit circle around $F^*$.

Using the standard planar ham-sandwich theorem, we partition $B_i^*$ into two subsets $L^*$ and $R^*$
of equal size by a line through the center $F^*$. Then we bisect $L^*$ by a ray from $F^*$, and we do the
same for $R^*$. It is easily checked (see Figure 1) that there always exist two of the resulting quarters,
one of $L^*$ and one of $R^*$ (the ones whose union forms an angle $\le \pi$ between the two bisecting rays),
such that every line connecting $F^*$ with a point in either quarter contains at least $\frac{1}{4}|B_i^*|$ points of $B_i^*$
on each side. Referring to these quarters as "good", we now take one of the bisecting rays, say that
of $L^*$, and rotate it about $F^*$ away from the good quarter of $L^*$. Each of the first $\frac{1}{8}|B_i^*|$ points that
the ray encounters has the property that the line supporting the ray has at least $\frac{1}{8}|B_i^*|$ points of $B_i^*$
on each side. This implies that, for at least half of the points in each of the two remaining quarters,
the line connecting $F^*$ to such a point has at least $\frac{1}{8}|B_i^*|$ points of $B_i^*$ on each side. Hence at most
$\frac{1}{4}|B_i|$ points of $B_i$ can lead to a cut that is not well-dissecting for $B_i$.

Figure 1. Illustration to the proof of Lemma 3.1.

We conclude that, still conditioned on the choices of $a_j$, $j \ne i$, the event $X_i = 1$ has probability
at most 1/4. Since this holds for every choice of the $a_j$, $j \ne i$, the unconditional probability of
$X_i = 1$ is also at most 1/4, and thus $E[X_i] \le 1/4$ as claimed. □

Hence, the expected number of sets $A_i$ that are not well-dissected by $f$ is

$$E\Big[\sum_{i=1}^{k} X_i\Big] = \sum_{i=1}^{k} E[X_i] \le \frac{k}{4}.$$

By Markov's inequality, the probability that more than $k/2$ of the sets fail to be well-dissected is at
most $(k/4)/(k/2) = 1/2$; that is, with probability at least 1/2, at least half of the $A_i$'s are well-dissected
by $f$. We thus obtain a polynomial that is well-dissecting for at least half of the $A_i$'s after an
expected constant number of trials.

It remains to estimate the running time of each trial. The points $b_1, \ldots, b_k$ can be chosen in
$O(n)$ time. Computing $h$ involves solving a $k \times k$ linear system, which can be done in $O(k^3)$ time
using Gaussian elimination, or even faster using fast matrix multiplication. Note that we do not
need to actually compute the entire sets $\Phi(A_i)$. No computation is needed for passing from $h$ to
$f$; we just re-interpret the coefficients. To check which of $A_1, \ldots, A_k$ are well-dissected by $f$, we
evaluate $f$ at each point of $A = \bigcup_{i} A_i$. First we evaluate each of the $k$ monomials in $M$ at each
point of $A$. If we proceed incrementally, from lower degrees to higher ones, this can be done with
$O(1)$ operations per monomial and point of $A$, in $O(nk)$ time in total. Then, in additional $O(nk)$
time, we compute the values of $f(q)$, for all $q \in A$, from the values of the monomials. Putting
everything together we obtain the following lemma.

Lemma 3.2. Given point sets $A_1, \ldots, A_k$ in $\mathbb{R}^d$ (for fixed $d$) with $n$ points in total, a polynomial
$f$ of degree $O(k^{1/d})$ that is well-dissecting for at least $\lceil k/2 \rceil$ of the $A_i$'s can be constructed in
$O(nk + k^3)$ randomized expected time.
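
The incremental evaluation used in the running-time analysis above, from lower degrees to higher ones, can be sketched as follows. This is a minimal Python illustration with hypothetical names; for simplicity it evaluates all monomials of degree at most $D$ rather than a chosen subset $M$, and it represents a polynomial as a dictionary from exponent vectors to coefficients.

```python
from typing import Dict, Sequence, Tuple

Exponent = Tuple[int, ...]


def evaluate_monomials(x: Sequence[float], D: int) -> Dict[Exponent, float]:
    """Values of all monomials of degree <= D at x, using one multiplication per
    monomial: each monomial is a lower-degree monomial times a single variable."""
    d = len(x)
    zero = (0,) * d
    values: Dict[Exponent, float] = {zero: 1}
    frontier = [zero]                      # monomials of the current degree
    for _ in range(D):
        next_frontier = []
        for e in frontier:
            # Multiply only by variables with index >= the last variable present
            # in e, so that every monomial is produced exactly once.
            last = max((i for i, ei in enumerate(e) if ei > 0), default=0)
            for i in range(last, d):
                e2 = e[:i] + (e[i] + 1,) + e[i + 1:]
                values[e2] = values[e] * x[i]
                next_frontier.append(e2)
        frontier = next_frontier
    return values


def evaluate_polynomial(coeffs: Dict[Exponent, float], x: Sequence[float], D: int) -> float:
    """f(x) as a linear combination of the precomputed monomial values."""
    values = evaluate_monomials(x, D)
    return sum(c * values[e] for e, c in coeffs.items())


vals = evaluate_monomials((2, 3), 3)
print(len(vals), vals[(1, 2)], vals[(3, 0)])
print(evaluate_polynomial({(2, 0): 1, (0, 2): 1, (0, 0): -1}, (1, 0), 2))
```

With $k$ monomials and $n$ points this costs $O(nk)$ arithmetic operations for fixed $d$, matching the bound in the text; the tuple manipulation adds only an $O(d)$ factor per monomial.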

Constructing a partitioning polynomial: Proof of Theorem 1.1. We now describe the algorithm
for computing an $r$-partitioning polynomial $f$. We essentially imitate the Guth-Katz construction,
with Lemma 3.2 replacing the polynomial ham-sandwich theorem, but with an additional twist.

The algorithm works in phases. At the end of the $j$-th phase, for $j \ge 1$, we have a family
$f_1, \ldots, f_j$ of $j$ polynomials and a family $\mathcal{P}_j$ of at most $2^j$ pairwise-disjoint subsets of $P$, each of
size at most $(7/8)^j n$. Similar to the Guth-Katz construction, $\mathcal{P}_j$ is not necessarily a partition of $P$,
since the points of $P \cap Z(f_1 f_2 \cdots f_j)$ do not belong to $\bigcup \mathcal{P}_j$. Initially, $\mathcal{P}_0 = \{P\}$. The algorithm
stops when each set in $\mathcal{P}_j$ has at most $n/r$ points. In the $j$-th phase, the algorithm constructs $f_j$ and
$\mathcal{P}_j$ as follows.

At the beginning of the $j$-th phase, let $\mathcal{L}_j = \{Q \in \mathcal{P}_{j-1} \mid |Q| > (7/8)^j n\}$ be the family
of the "large" sets in $\mathcal{P}_{j-1}$, and set $\kappa_j = |\mathcal{L}_j| \le (8/7)^j$. We also initialize the collection $\mathcal{P}_j$ to
$\mathcal{P}_{j-1} \setminus \mathcal{L}_j$, the family of "small" sets in $\mathcal{P}_{j-1}$. Then we perform at most $\lceil \log_2 \kappa_j \rceil$ dissecting steps,
as follows: After $s$ steps, we have a family $g_1, \ldots, g_s$ of polynomials, the current set $\mathcal{P}_j$, and a
subfamily $\mathcal{L}_j^{s} \subseteq \mathcal{L}_j$ of size at most $\kappa_j/2^s$, consisting of the members of $\mathcal{L}_j$ that were not
well-dissected by any of $g_1, \ldots, g_s$. If $\mathcal{L}_j^{s} \ne \emptyset$, we choose, using Lemma 3.2, a polynomial $g_{s+1}$ of
degree at most $c(\kappa_j/2^s)^{1/d}$ (with a suitable constant $c$ that depends only on $d$) that well-dissects
at least half of the members of $\mathcal{L}_j^{s}$. For each $Q \in \mathcal{L}_j^{s}$, let $Q^+ = \{q \in Q \mid g_{s+1}(q) > 0\}$
and $Q^- = \{q \in Q \mid g_{s+1}(q) < 0\}$. If $Q$ is well-dissected, i.e., $|Q^+|, |Q^-| \le \frac{7}{8}|Q|$, then we
add $Q^+, Q^-$ to $\mathcal{P}_j$, and otherwise, we add $Q$ to $\mathcal{L}_j^{s+1}$. Note that in the former case the points
$q \in Q$ satisfying $g_{s+1}(q) = 0$ are "lost" and do not participate in the subsequent dissections. By
Lemma 3.2, $|\mathcal{L}_j^{s+1}| \le |\mathcal{L}_j^{s}|/2 \le \kappa_j/2^{s+1}$.

The $j$-th phase is completed when $\mathcal{L}_j^{s} = \emptyset$, in which case we set $f_j := \prod_{\ell=1}^{s} g_\ell$. By
construction, each point set in $\mathcal{P}_j$ has at most $(7/8)^j n$ points, and the points of $P$ not belonging to any set
of $\mathcal{P}_j$ lie in $Z(f_1 \cdots f_j)$. Furthermore,

$$\deg f_j \le \sum_{s \ge 0} c(\kappa_j/2^s)^{1/d} = O(\kappa_j^{1/d}),$$

where again the constant of proportionality depends only on $d$. Since every set in $\mathcal{P}_{j-1}$ is split into
at most two sets before being added to $\mathcal{P}_j$, $|\mathcal{P}_j| \le 2|\mathcal{P}_{j-1}| \le 2^j$.

If $\mathcal{P}_j$ contains subsets with more than $n/r$ points, we begin the $(j+1)$-st phase with the current
$\mathcal{P}_j$; otherwise the algorithm stops and returns $f := f_1 f_2 \cdots f_j$. This completes the description of
the algorithm.

Clearly, the number $m$ of phases of the algorithm is at most $\lceil \log_{8/7} r \rceil$. Following the same
argument as in [12], it can be shown that all points lying in a single connected component of
$\mathbb{R}^d \setminus Z(f)$ belong to a single member of $\mathcal{P}_m$, and thus each connected component contains at most $n/r$
points of $P$. Since the degree of $f_j$ is $O(\kappa_j^{1/d})$, $\kappa_j \le (8/7)^j$, and $m \le \lceil \log_{8/7} r \rceil$, we conclude that

$$\deg f = \sum_{j=1}^{m} \deg f_j = \sum_{j=1}^{m} O\big((8/7)^{j/d}\big) = O(r^{1/d}).$$

As for the expected running time of the algorithm, the $s$-th step of the $j$-th phase takes
$O(n\kappa_j/2^s + (\kappa_j/2^s)^3)$ expected time, so the $j$-th phase takes a total of $O(n\kappa_j + \kappa_j^3)$ expected time. Substituting
$\kappa_j \le (8/7)^j$ in the above bound and summing over all $j$, the overall expected running time of the
algorithm is $O(nr + r^3)$. This completes the proof of Theorem 1.1. □



Remark. Theorem 1.1 is employed for the preprocessing in our range-searching algorithms in
Theorems 1.2 and 1.4. In Theorem 1.2 we take $r$ to be a large constant, and the expected running
time in Theorem 1.1 is $O(n)$. However, in Theorem 1.4 we require $r$ to be a small fractional power
of $n$, say $r = n^{0.001}$. It is a challenging open problem to improve the expected running time in
Theorem 1.1 to $O(n\,\mathrm{polylog}(n))$ when $r$ is such a small fractional power of $n$. The bottleneck
in the current algorithm is the subproblem of evaluating a given $d$-variate polynomial $f$ of degree
$D = O(r^{1/d})$ at $n$ given points; everything else can be performed in $O(n\,\mathrm{polylog}(r) + r^{O(1)})$
expected time. Finding the signs of $f$ at those points would actually suffice, but this probably does
not make the problem any simpler.

This problem of multi-evaluation of multivariate real polynomials has been considered in the
literature, and there is a nontrivial improvement over the straightforward $O(nr)$ algorithm, due to
Nüsken and Ziegler [25]. Concretely, in the bivariate case ($d = 2$), their algorithm can evaluate a
bivariate polynomial of degree $D < \sqrt{n}$ at $n$ given points using $O(nD^{0.667})$ arithmetic operations.
It is based on fast matrix multiplication, and even under the most optimistic possible assumption on
the speed of matrix multiplication, it cannot get below $nD^{1/2}$. Although this is significantly faster
than our naive $O(nr)$-time algorithm, which is $O(nD^2)$ in this case, it is still a far cry from what we
are aiming at. Let us remark that in a different setting, for polynomials over finite fields (and over
certain more general finite rings), there is a remarkable method for multi-evaluation by Kedlaya and
Umans [16] achieving $O(((n + D^d) \log q)^{1+\varepsilon})$ running time, where $q$ is the cardinality of the field.



Note that $f_j$ is not necessarily well-dissecting, because it does not control the sizes of subsets with positive or with
negative signs.

4 Crossing a Polynomial Partition with a Range



In this section we define the crossing number of a polynomial partition and describe an algorithm 
for computing the cells of a polynomial partition that are crossed by a semialgebraic range, both of 
which will be crucial for our range-searching data structures. We begin by recalling a few results 
on arrangements of algebraic surfaces. 

Let $\mathcal{S}$ be a finite set of algebraic surfaces in $\mathbb{R}^d$. The arrangement of $\mathcal{S}$, denoted by $\mathcal{A}(\mathcal{S})$, is
the partition of $\mathbb{R}^d$ into maximal relatively open connected subsets, called cells, such that all points
within each cell lie in the same subset of surfaces of $\mathcal{S}$ (and in no other surface). If $\mathcal{F}$ is a set of
$d$-variate polynomials, then with a slight abuse of notation, we use $\mathcal{A}(\mathcal{F})$ to denote the arrangement
$\mathcal{A}(\{Z(f) \mid f \in \mathcal{F}\})$ of their zero sets. We need the following result on arrangements:

Theorem 4.1 (Basu, Pollack and Roy [5, Theorem 16.18]). Let $\mathcal{F} = \{f_1, \ldots, f_s\}$ be a set of $s$ real
$d$-variate polynomials, each of degree at most $\Delta$. Then the arrangement $\mathcal{A}(\mathcal{F})$ in $\mathbb{R}^d$ has at most
$(s\Delta)^{O(d)}$ cells, and it can be computed in time at most $T = s^{d+1}\Delta^{O(d^4)}$. Each cell is described
as a semialgebraic set using at most $T$ polynomials of degree bounded by $\Delta^{O(d^3)}$. Moreover, the
algorithm supplies adjacency information for the cells, indicating which cells are contained in the
boundary of each cell, and it also supplies an explicitly given point in each cell.

A key ingredient for the analysis of our range-searching data structure is the following recent
result of Barone and Basu [3], which is a refinement of a series of previous studies; e.g., see [4, 5]:

Theorem 4.2 (Barone and Basu [3]). Let $V$ be a $k$-dimensional algebraic variety in $\mathbb{R}^d$ defined by a
finite set $\mathcal{G}$ of $d$-variate polynomials, each of degree at most $\Delta$, and let $\mathcal{F}$ be a set of $s$ polynomials of
degree at most $D \ge \Delta$. Then the number of cells of $\mathcal{A}(\mathcal{F} \cup \mathcal{G})$ (of all dimensions) that are contained
in $V$ is bounded by $O(1)^d \Delta^{d-k} (sD)^k$.

The crossing number of polynomial partitions. Let $P$ be a set of $n$ points in $\mathbb{R}^d$, and let $f$
be an $r$-partitioning polynomial for $P$. Recall that the polynomial partition $\Omega = \Omega(f)$ induced
by $f$ is the partition of $\mathbb{R}^d$ into the zero set $Z(f)$ and the connected components $\omega_1, \omega_2, \ldots, \omega_t$ of
$\mathbb{R}^d \setminus Z(f)$. As already noted, Warren's theorem [31] implies that $t = O(r)$. We call $\omega_1, \ldots, \omega_t$
the cells of $\Omega$ (although they need not be cells in the sense typical, e.g., in topology; they need not
even be simply connected). $\Omega$ also induces a partition $P^*, P_1, \ldots, P_t$ of $P$, where $P^* = P \cap Z(f)$
is the exceptional part, and $P_i = P \cap \omega_i$, for $i = 1, \ldots, t$, are the regular parts. By construction,
$|P_i| \le n/r$ for every $i = 1, 2, \ldots, t$, but we have no control over the size of $P^*$; this will be the
source of most of our technical difficulties.

Next, let $\gamma$ be a range in $\Gamma_{d,\Delta,s}$. We say that $\gamma$ crosses a cell $\omega_i$ if neither $\omega_i \subseteq \gamma$ nor $\omega_i \cap \gamma = \emptyset$.
The crossing number of $\gamma$ is the number of cells of $\Omega$ crossed by $\gamma$, and the crossing number of $\Omega$
(with respect to $\Gamma_{d,\Delta,s}$) is the maximum of the crossing numbers of all $\gamma \in \Gamma_{d,\Delta,s}$. Similar to many
previous range-searching algorithms [6, 21, 22], the crossing number of $\Omega$ will determine the query
time of our range-searching algorithms described in Sections 5 and 7.

Lemma 4.3. If $\Omega$ is a polynomial partition induced by an $r$-partitioning polynomial of degree at
most $D$, then the crossing number of $\Omega$ with respect to $\Gamma_{d,\Delta,s}$, with $\Delta \le D$, is at most $Cs\Delta D^{d-1}$,
where $C$ is a suitable constant depending only on $d$.

Proof. Let $\gamma \in \Gamma_{d,\Delta,s}$; then $\gamma$ is a Boolean combination of up to $s$ sets of the form $\gamma_j := \{x \in \mathbb{R}^d \mid
g_j(x) \ge 0\}$, where $g_1, \ldots, g_s$ are polynomials of degree at most $\Delta$. If $\gamma$ crosses a cell $\omega_i$, then at
least one of the ranges $\gamma_j$ also crosses $\omega_i$, and thus it suffices to establish that the crossing number
of any range $\gamma$, defined by a single $d$-variate polynomial inequality $g(x) \ge 0$ of degree at most $\Delta$,
is at most $C\Delta D^{d-1}$.

We apply Theorem 4.2 with $V := Z(g)$, which is an algebraic variety of dimension $k \le d - 1$,
and with $s = 1$ and $\mathcal{F} = \{f\}$, where $f$ is the $r$-partitioning polynomial. Then, for each cell $\omega_i$
crossed by $\gamma$, $\omega_i \cap Z(g)$ is a nonempty union of some of the cells in $\mathcal{A}(\mathcal{F} \cup \{g\}) = \mathcal{A}(\{f, g\})$ that
lie in $V$. Thus, the crossing number of $\gamma$ is at most $O(1)^d \Delta D^{d-1}$, and multiplying this bound by $s$
yields the bound asserted in the lemma. □

Algorithmic issues. We need to perform the following algorithmic primitives (for $d$ fixed as
usual) for the range-searching algorithms that we will later present:

(A1) Given an $r$-partitioning polynomial $f$ of degree $D = O(r^{1/d})$, compute (a suitable
representation of) the partition $\Omega$ and the induced partition of $P$ into $P^*, P_1, \ldots, P_t$.

By computing $\mathcal{A}(\{f\})$, using Theorem 4.1, and then testing the membership of each point
$p \in P$ in each cell $\omega_i$ in time polynomial in $r$, the above operation can be performed in
$O(nr^c)$ time,⁵ where $c = d^{O(1)}$.

(A2) Given (a suitable representation of) $\Omega$ as in (A1) and a query range $\gamma \in \Gamma_{d,\Delta,1}$, i.e., a range
defined by a single $d$-variate polynomial $g$ of degree $\Delta \le D$, compute which of the cells of
$\Omega$ are crossed by $\gamma$ and which are completely contained in $\gamma$.

By computing the arrangement $\mathcal{A}(\{f, g\})$ and deducing the required classification of the cells
$\omega_i$ from the combinatorial information about the cells of this arrangement, using Theorem 4.1,
the above task can be accomplished in time $O(r^c)$, with $c$ as in (A1).



5 Constant Fan-Out Partition Tree 

We are now ready to describe our first data structure for $\Gamma_{d,\Delta,s}$-range searching, which is a constant
fan-out (branching degree) partition tree, and which works for points in general position.

Proof of Theorem 1.2. Let $P$ be a set of $n$ points in $\mathbb{R}^d$, and let $\Delta, s$ be constants. We choose
$r$ as a (large) constant depending on $d$, $\Delta$, and $\varepsilon$. We assume $P$ to be in $D_0$-general position for
some sufficiently large constant $D_0 \gg r^{1/d}$. We construct a partition tree $\mathcal{T}$ of fan-out $O(r)$ as
follows. We first construct an $r$-partitioning polynomial $f$ for $P$ using Theorem 1.1, and compute
the partition $\Omega$ of $\mathbb{R}^d$ induced by $f$, as well as the corresponding partition $P = P^* \cup P_1 \cup \cdots \cup P_t$
of $P$, where $t = O(r)$. Since $r$ is a constant, the (A1) operation, discussed in Section 4, performs
this computation in $O(n)$ time. Since $P$ is in $D_0$-general position, and since we choose $D_0$ to be at
least $\deg f$, the size of $P^* = P \cap Z(f)$ is bounded by $D_0$.
We set up the root of $\mathcal{T}$, where we store

(i) the partitioning polynomial $f$, and a suitable representation of the partition $\Omega$;

(ii) a list of the points of the exceptional part $P^*$; and

(iii) $w(P_i)$, the sum of the weights of the points of $P_i$, for each $i = 1, 2, \ldots, t$.

The regular parts $P_i$ are not stored explicitly at the root. Instead, for each $P_i$ we recursively build
a subtree representing it. The recursion terminates, at leaves of $\mathcal{T}$, as soon as we reach point sets
of size smaller than a suitable constant $n_0$. The points of each such set are stored explicitly at the
corresponding leaf of $\mathcal{T}$.

⁵ Of course, this is somewhat inefficient, and it would be nice to have a fast point-location algorithm for the partition
$\Omega$; this would be the second step, together with an improved construction of an $r$-partitioning polynomial $f$ (concretely,
an improved multi-point evaluation procedure for $f$) as discussed at the end of Section 3, needed to improve the
preprocessing time in Theorem 1.4.

Since each node of $\mathcal{T}$ requires only a constant amount of storage and each point of $P$ is stored
at only one node of $\mathcal{T}$, the total size of $\mathcal{T}$ is $O(n)$. The preprocessing time is $O(n \log n)$ since $\mathcal{T}$ has
depth $O(\log_r n)$ and each level is processed in $O(n)$ time.

To process a query range $\gamma \in \Gamma_{d,\Delta,s}$, we start at the root of $\mathcal{T}$ and maintain a global counter
which is initially set to 0. Among the cells $\omega_1, \ldots, \omega_t$ of the partition $\Omega$ stored at the root, we find,
using the (A2) operation, those completely contained in $\gamma$, and those crossed by $\gamma$. Actually, we
compute a superset of the cells that $\gamma$ crosses, namely, the cells crossed by the zero set of at least
one of the (at most $s$) polynomials defining $\gamma$. For each cell $\omega_i \subseteq \gamma$, we add the weight $w(P_i)$ to
the global counter. We also add to the global counter the weights of the points in $P^* \cap \gamma$, which
we find by testing each point of $P^*$ separately. Then we recurse in each subtree corresponding to
a cell $\omega_i$ crossed by $\gamma$ (in the above weaker sense). The leaves, with point sets of size $O(1)$, are
processed by inspecting their points individually. By Lemma 4.3, the number of cells crossed by
any of the polynomials defining $\gamma$ at any interior node of $\mathcal{T}$ is at most $Cs\Delta D^{d-1} \le C' r^{1-1/d}$, where
$C' = C'(d, s, \Delta)$ is a constant independent of $r$.
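The build and query procedures above can be illustrated by a one-dimensional toy analogue. This is a minimal sketch under stated assumptions, not the structure of Theorem 1.2: here $d = 1$, the partitioning polynomial is $f(x) = \prod_i (x - a_i)$ with roots taken as quantiles of the input (rather than computed via Theorem 1.1), the cells are the open intervals between consecutive roots, and the ranges are closed intervals. The class name `Node` and the parameters `r` and `n0` are hypothetical.

```python
from bisect import bisect_left


class Node:
    """Node of a toy partition tree for weighted points (x, w) on the real line."""

    def __init__(self, pts, r=4, n0=8):
        self.is_leaf = len(pts) <= n0
        if self.is_leaf:
            self.points = list(pts)  # leaves store their points explicitly
            return
        xs = sorted(x for x, _ in pts)
        # Roots of f(x) = prod (x - a_i): up to r-1 quantiles of the input.
        self.roots = sorted({xs[(k * len(xs)) // r] for k in range(1, r)})
        on_zero_set = set(self.roots)
        # Exceptional part P*: the points lying on the zero set of f.
        self.exceptional = [(x, w) for x, w in pts if x in on_zero_set]
        # Regular parts: points in the open cells between consecutive roots.
        parts = [[] for _ in range(len(self.roots) + 1)]
        for x, w in pts:
            if x not in on_zero_set:
                parts[bisect_left(self.roots, x)].append((x, w))
        self.weights = [sum(w for _, w in part) for part in parts]  # w(P_i)
        self.children = [Node(part, r, n0) for part in parts]

    def query(self, lo, hi):
        """Total weight of the points x with lo <= x <= hi."""
        if self.is_leaf:
            return sum(w for x, w in self.points if lo <= x <= hi)
        # Points of P* are tested one by one.
        total = sum(w for x, w in self.exceptional if lo <= x <= hi)
        bounds = [float("-inf")] + self.roots + [float("inf")]
        for i, child in enumerate(self.children):
            a, b = bounds[i], bounds[i + 1]  # cell i is the open interval (a, b)
            if lo <= a and b <= hi:
                total += self.weights[i]  # cell contained in the range
            elif b <= lo or hi <= a:
                continue  # cell disjoint from the range
            else:
                total += child.query(lo, hi)  # cell crossed: recurse
        return total
```

Only additions of weights are used, matching the semigroup setting. In this toy setting an interval crosses at most two cells per node, which is the $d = 1$ analogue of the $O(r^{1-1/d})$ crossing-number bound.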

The query time $Q(n)$ obeys the following recurrence:

$$Q(n) \le \begin{cases} C' r^{1-1/d}\, Q(n/r) + O(1) & \text{for } n > n_0, \\ O(n) & \text{for } n \le n_0. \end{cases}$$

It is well known (e.g., see [21]), and easy to check, that the recurrence solves to $Q(n) = O(n^{1-1/d+\varepsilon})$
for every fixed $\varepsilon > 0$, with an appropriate sufficiently large choice of $r$ as a function of $C'$ and $\varepsilon$,
and with an appropriate choice of $n_0$. This concludes the proof of Theorem 1.2. □
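The dependence of the exponent on $r$ can be checked numerically. The sketch below assumes the recurrence with additive term exactly 1 and base case $Q(n) = n$ for $n \le n_0$, and assumes $C' > 1$; the function names are illustrative. Unrolling the recurrence gives growth $n^{\log_r(C' r^{1-1/d})} = n^{1-1/d+\log_r C'}$, so $\varepsilon = \log_r C'$ and it suffices to take $r \ge C'^{1/\varepsilon}$.

```python
import math


def query_exponent(c_prime: float, r: float, d: int) -> float:
    """Exponent a with Q(n) = O(n^a) for Q(n) = c_prime * r^(1-1/d) * Q(n/r) + O(1)."""
    return 1.0 - 1.0 / d + math.log(c_prime) / math.log(r)


def fan_out_for(c_prime: float, eps: float) -> float:
    """Smallest r with log_r(c_prime) <= eps, namely r = c_prime^(1/eps)."""
    return c_prime ** (1.0 / eps)


def solve_recurrence(n: float, c_prime: float, r: float, d: int, n0: float) -> float:
    """Evaluate the recurrence directly (additive term 1, Q(n) = n for n <= n0)."""
    if n <= n0:
        return n
    branching = c_prime * r ** (1.0 - 1.0 / d)
    return branching * solve_recurrence(n / r, c_prime, r, d, n0) + 1.0
```

For example, with $C' = 4$ and $d = 2$, the choice $r = 4^{10}$ gives exponent $1 - 1/2 + 0.1$; a larger $C'$ forces a larger constant $r$ for the same $\varepsilon$.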



Boundary-fuzzy range searching: Proof of Corollary 1.3. Now we consider the case where the
points of $P$ are not necessarily in $D_0$-general position. We perturb them infinitesimally using
"simulation of simplicity" (see below for more details), so that the perturbed set is in $D_0$-general position.
As can easily be checked, all the branching steps in the range searching algorithm presented above
are based on the sign of suitable polynomials in the coordinates of the input points and in the
coefficients of the polynomials in the query range, and the degrees of these polynomials are bounded
by a constant, albeit a large one. For example, for producing the partitioning polynomial $f$, we
solve systems of linear equations, and thus the coefficients of $f$ are given by certain determinants,
obtained from Cramer's rule. Similarly, the condition of $D_0$-general position can be expressed as
non-vanishing of suitable polynomials in the point coordinates (also see, e.g., [14, Lemma 6.3]).

We can perform the infinitesimal perturbation using the general perturbation scheme of Yap
[33]. The overhead per operation required by this perturbation scheme is considerable, but it still
multiplies the bound by only a constant factor, since the degrees of all the polynomials involved are
bounded by a constant. Then we construct the above data structure on the perturbed point set, which
is in $D_0$-general position. By answering the query for the infinitesimally perturbed set, we obtain
a boundary-fuzzy answer for the original point set. The preprocessing cost, storage, and query
time remain asymptotically the same as in Theorem 1.2 (but, as just noted, with larger constants of
proportionality). □

Figure 2. The zero set of the partitioning polynomial (left), and partitioning it into monotone patches that project to the
hyperplane $H$ bijectively. Only the 1-dimensional patches are labeled.



6 Decomposing a Surface into Monotone Patches 

As mentioned in the Introduction, if we construct an $r$-partitioning polynomial $f$ for an arbitrary
point set $P$, the exceptional set $P^* = P \cap Z(f)$ may be large, as is schematically indicated in
Fig. 2 (left). Since $P^*$ is not partitioned by $f$ in any reasonable sense, it must be handled differently,
as described below.

Following the terminology in [13, 26], we call a direction $v \in \mathbb{S}^{d-1}$ good for $f$ if, for every
$a \in \mathbb{R}^d$, the polynomial $p(t) = f(a + vt)$ does not vanish identically; that is, any line in direction
$v$ intersects $Z(f)$ at finitely many points. As argued in [26], a random direction is good for $f$ with
high probability. By choosing a good direction and rotating the coordinate system, we assume that
the $x_d$-direction, referred to as the vertical direction, is good for $f$.
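One simple way to certify that a direction is good is sketched below; it is a sufficient test only, and not necessarily the procedure used in [26]. If $f$ has degree $D$ and $f_D$ denotes its top-degree homogeneous part, then the coefficient of $t^D$ in $f(a + vt)$ equals $f_D(v)$ for every $a$, so $f_D(v) \ne 0$ implies that $v$ is good. The sparse-dictionary representation and all function names are assumptions made for this illustration; the dictionary is assumed to hold only nonzero coefficients.

```python
import random


def top_form_value(poly, v):
    """Evaluate the top-degree homogeneous part of poly at the vector v.

    poly maps exponent tuples (e_1, ..., e_d) to nonzero coefficients.
    """
    degree = max(sum(exps) for exps in poly)
    total = 0.0
    for exps, coef in poly.items():
        if sum(exps) == degree:
            term = coef
            for vi, e in zip(v, exps):
                term *= vi ** e
            total += term
    return total


def certified_good(poly, v, tol=1e-12):
    """Sufficient test: the t^degree coefficient of f(a + v t) is nonzero for all a."""
    return abs(top_form_value(poly, v)) > tol


def random_direction(d, rng=random):
    """Random unit vector, obtained by normalizing a Gaussian sample."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = sum(x * x for x in g) ** 0.5
    return [x / norm for x in g]
```

For $f(x, y) = xy$, encoded as `{(1, 1): 1.0}`, the direction $(0, 1)$ is not certified, and indeed it is not good, since the vertical line $x = 0$ lies in $Z(f)$; the direction $(1, 1)$ is certified. A direction that fails this test may still be good, so the test can only be used to accept directions.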

In order to deal with $P^*$, we partition $Z(f)$ into finitely many pieces, called patches, in such a
way that each of the patches is monotone in the vertical direction, meaning that every line parallel
to the $x_d$-axis intersects it at most once. This is illustrated, in the somewhat trivial 2-dimensional
setting, in Fig. 2 (right): there are five one-dimensional patches $\pi_1, \ldots, \pi_5$, plus four 0-dimensional
patches. Then we treat each patch $\pi$ separately: We project the point set $P^* \cap \pi$ orthogonally to the
coordinate hyperplane $H := \{x_d = 0\}$, and we preprocess the projected set, denoted $P^*_\pi$, for range
searching with suitable ranges. These ranges are projections of ranges of the form $\gamma \cap \pi$, where
$\gamma \in \Gamma_{d,\Delta,s}$ is one of the original ranges. In Fig. 2 (middle), the patch $\pi_1$ is drawn thick, a range $\gamma$ is
depicted as a gray disk, and the projection $\gamma_{\pi_1}$ of $\gamma \cap \pi_1$ is shown as a thick segment in $H$.

The projected range $\gamma_\pi$ is typically more complicated than the original range $\gamma$ (it involves
more polynomials of larger degrees), but crucially, it is only $(d-1)$-dimensional, and $(d-1)$-dimensional
queries can be processed somewhat more efficiently than $d$-dimensional ones, which
makes the whole scheme work. We will discuss this in more detail in Section 7 below, but first
we recall the notion of cylindrical algebraic decomposition (CAD, or also Collins decomposition),
which is a tool that allows us to decompose $Z(f)$ into monotone patches, and also to compute the
projected ranges $\gamma_\pi$.

Figure 3. A schematic illustration of the first-stage cylindrical algebraic decomposition.

Given a finite set $\mathcal{F} = \{f_1, \ldots, f_s\}$ of $d$-variate polynomials, a cylindrical algebraic
decomposition adapted to $\mathcal{F}$ is a way of decomposing $\mathbb{R}^d$ into a finite collection of relatively open cells,
which have a simple shape (in a suitable sense), and which refine the arrangement $\mathcal{A}(\mathcal{F})$. We refer,
e.g., to [5, Chap. 5.12] for a definition and construction of the "standard" CAD. Here we will use
a simplified variant, which can be regarded as the "first stage" of the standard CAD, and which is
captured by [5, Theorem 5.14, Algorithm 12.1]. We also refer to [26, Appendix A] for a concise
treatment, which is perhaps more accessible at first encounter.

Let $\mathcal{F}$ be as above. To obtain the first-stage CAD for $\mathcal{F}$, one constructs a suitable collection
$\mathcal{E} = \mathcal{E}(\mathcal{F})$ of polynomials in the variables $x_1, \ldots, x_{d-1}$ (denoted by $\mathrm{Elim}_{X_d}(\mathcal{F})$ in [5]). Roughly
speaking, the zero sets of the polynomials in $\mathcal{E}$, viewed as subsets of the coordinate hyperplane $H$
(which is identified with $\mathbb{R}^{d-1}$), contain the projection onto $H$ of all intersections $Z(f_i) \cap Z(f_j)$,
$1 \le i < j \le s$, as well as the projection of the loci in $Z(f_i)$ where $Z(f_i)$ has a vertical tangent
hyperplane, or a singularity of some kind. The actual construction of $\mathcal{E}$ is somewhat more complicated,
and we refer to the aforementioned references for more details.

Having constructed $\mathcal{E}$, the first-stage CAD is obtained as the arrangement $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$ in $\mathbb{R}^d$, where
the polynomials in $\mathcal{E}$ are now considered as $d$-variate polynomials (in which the variable $x_d$ is not
present). In geometric terms, we erect a "vertical wall" in $\mathbb{R}^d$ over each zero set within $H$ of a
$(d-1)$-variate polynomial from $\mathcal{E}$, and the CAD is the arrangement of these vertical walls plus the
zero sets of $f_1, \ldots, f_s$. The first-stage CAD is illustrated in Fig. 3 for the same (single) polynomial
as in Fig. 2 (left).

In our algorithm, we are interested in the cells of the CAD that are contained in some of the
$Z(f_i)$; these are going to be the monotone patches alluded to above. We note that using the
first-stage CAD for the purpose of decomposing $Z(f)$ into monotone patches seems somewhat wasteful.
For example, the number of patches in Fig. 2 is considerably smaller than the number of patches
in the CAD in Fig. 3. But the CAD is simple and well known, and possible improvements in
the number of patches do not seem to influence our asymptotic bounds on the performance of the
resulting range-searching data structure.
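As a rough illustration of the vertical walls and monotone patches, consider the simplest nontrivial case: a single bivariate polynomial that is quadratic in the vertical variable, $f(x, y) = a(x) y^2 + b(x) y + c(x)$. The sketch below is an assumption-laden toy and not the construction of [5], which is more involved. It uses the classical discriminant $b^2 - 4ac$, whose real zeros are where the number of points of $Z(f)$ on a vertical line can change (assuming $a(x) \ne 0$). All function names are hypothetical.

```python
import math


def poly_eval(coeffs, x):
    """Evaluate a univariate polynomial given by coefficients in increasing degree."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def discriminant_at(a, b, c, x):
    """Value at x of b(x)^2 - 4 a(x) c(x), the discriminant of f with respect to y."""
    return poly_eval(b, x) ** 2 - 4.0 * poly_eval(a, x) * poly_eval(c, x)


def fiber(a, b, c, x):
    """Sorted y-values with f(x, y) = 0 on the vertical line at x, assuming a(x) != 0."""
    av, bv = poly_eval(a, x), poly_eval(b, x)
    disc = discriminant_at(a, b, c, x)
    if disc < 0:
        return []
    if disc == 0:
        return [-bv / (2.0 * av)]
    root = math.sqrt(disc)
    return sorted([(-bv - root) / (2.0 * av), (-bv + root) / (2.0 * av)])


# Unit circle f(x, y) = y^2 + (x^2 - 1): a = 1, b = 0, c = x^2 - 1.
CIRCLE = ([1.0], [0.0], [-1.0, 0.0, 1.0])
```

For the unit circle the discriminant is $4 - 4x^2$, so the walls stand over $x = \pm 1$. Above the open interval $(-1, 1)$ every vertical line meets the circle twice, giving two monotone arcs (the upper and lower half-circles), and the two points $(\pm 1, 0)$ form 0-dimensional patches.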

The following lemma summarizes the properties of the first-stage CAD that we will need; we
refer to [5, Theorem 5.14, Algorithm 12.1] for a proof.

Lemma 6.1 (Single-stage CAD). Given a set $\mathcal{F} = \{f_1, \ldots, f_s\} \subset \mathbb{R}[x_1, \ldots, x_d]$ of polynomials,
each of degree at most $D$, there is a set $\mathcal{E} = \mathcal{E}(\mathcal{F})$ of $O(s^2 D^3)$ polynomials in $\mathbb{R}[x_1, \ldots, x_{d-1}]$, each
of degree $O(D^2)$, which can be computed in time $s^2 D^{O(d)}$, such that the first-stage CAD defined by
these polynomials, i.e., the arrangement $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$ in $\mathbb{R}^d$, has the following properties:

(i) ("Cylindrical" cells) For each cell $\sigma$ of $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$, there exists a unique cell $\tau$ of the
$(d-1)$-dimensional arrangement $\mathcal{A}(\mathcal{E})$ in $H$, such that one of the following possibilities occurs:

(a) $\sigma = \{(x, \xi(x)) \mid x \in \tau\}$, where $\xi : \tau \to \mathbb{R}$ is a continuous semialgebraic function (that
is, $\sigma$ is the graph of $\xi$ above $\tau$).

(b) $\sigma = \{(x, t) \mid x \in \tau,\ t \in (\xi_1(x), \xi_2(x))\}$, where each $\xi_i$, $i = 1, 2$, is either a continuous
semialgebraic real-valued function on $\tau$, or the constant function $\tau \to \{\infty\}$, or the
constant function $\tau \to \{-\infty\}$, and $\xi_1(x) < \xi_2(x)$ for all $x \in \tau$ (that is, $\sigma$ is a portion
of the "cylinder" $\tau \times \mathbb{R}$ between two consecutive graphs).

(ii) (Refinement property) If $\mathcal{F}' \subseteq \mathcal{F}$, then $\mathcal{E}' = \mathcal{E}(\mathcal{F}') \subseteq \mathcal{E}$, and thus each cell of $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$ is
contained in some cell of $\mathcal{A}(\mathcal{F}' \cup \mathcal{E}')$.

Returning to the problem of decomposing the zero set of the partitioning polynomial $f$ into
monotone patches, we construct the first-stage CAD for $\mathcal{F} = \{f\}$, and the patches are the cells
of $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$ contained in $Z(f)$. If the $x_d$-direction is good for $f$, then every cell of $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$
lying in $Z(f)$ is of type (a), and so if any cell of type (b) lies in $Z(f)$, we choose another random
direction and construct the first-stage CAD in that direction. Putting everything together and using
Theorem 4.1 to bound the complexity of $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$, we obtain the following lemma.



Lemma 6.2. Let $f$ be a $d$-variate polynomial of degree $D$, and let us assume that the $x_d$-direction
is good for $f$. Then $Z(f)$ can be decomposed, in $D^{O(d^4)}$ time, into $D^{O(d)}$ monotone patches, and
each patch can be represented semialgebraically by $D^{O(d^4)}$ polynomials of degree $D^{O(d^3)}$.

The first-stage CAD can also be used to compute the projection of the intersection of a range in
$\Gamma_{d,\Delta,s}$ with a monotone patch of $f$.

Lemma 6.3. Let $\Pi$ be the decomposition of the zero set of a $d$-variate polynomial $f$ of degree $D$
into monotone patches, as described in Lemma 6.2, and let $\gamma$ be a semialgebraic set in $\Gamma_{d,\Delta,s}$, with
$\Delta \le D$. For every patch $\pi \in \Pi$, the projection of $\gamma \cap \pi$ in the $x_d$-direction can be represented
as a member of $\Gamma_{d-1,\Delta_1,s_1}$, i.e., by a Boolean combination of at most $s_1$ polynomial inequalities
in $(d-1)$ variables, each of degree at most $\Delta_1$, where $\Delta_1 = D^{O(d^3)}$ and $s_1 = (Ds)^{O(d^4)}$. The
representation can be computed in $(Ds)^{O(d^4)}$ time.

Proof. The task of computing $\gamma_\pi$, the projection of $\gamma \cap \pi$, is similar to the operation (A2) discussed
in Section 4. In more abstract terms, it can also be viewed as a quantifier elimination task: we can
represent $\gamma \cap \pi$ by a quantifier-free formula $\Phi(x_1, \ldots, x_d)$ (a Boolean combination of polynomial
inequalities); then $\gamma_\pi$ is represented by $\exists x_d\, \Phi(x_1, \ldots, x_d)$, and by eliminating $\exists x_d$ we obtain a
quantifier-free formula describing $\gamma_\pi$. More concretely, we use a procedure based on the first-stage
CAD (Lemma 6.1) and the arrangement construction (Theorem 4.1).

By definition, $\gamma$ is a Boolean combination of inequalities of the form $g_1 \ge 0, \ldots, g_s \ge 0$, where
$g_1, \ldots, g_s$ are $d$-variate polynomials, each of degree at most $\Delta \le D$. We set $\mathcal{F} := \{f, g_1, \ldots, g_s\}$,
we compute the set $\mathcal{E} = \mathcal{E}(\mathcal{F})$ of $(d-1)$-variate polynomials as in Lemma 6.1, and the first-stage
CAD is then computed as the $d$-dimensional arrangement $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$ according to Theorem 4.1. Since,
by Lemma 6.1(ii), $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$ refines $\mathcal{A}(\{f\} \cup \mathcal{E}(\{f\}))$ (the first-stage CAD from the preprocessing
phase), each patch $\pi \in \Pi$ is decomposed into subpatches. Since the sign of each $g_i$ is constant on
each cell of $\mathcal{A}(\mathcal{F})$, and thus on each cell of $\mathcal{A}(\mathcal{F} \cup \mathcal{E})$, $\gamma \cap \pi$ is a disjoint union of subpatches. The
projections of these subpatches into $H$ are cells of $\mathcal{A}(\mathcal{E})$, and thus we obtain, in time $(Ds)^{O(d^4)}$
by Theorem 4.1, a representation of $\gamma_\pi$ as a member of $\Gamma_{d-1,\Delta_1,s_1}$, where $\Delta_1 = D^{O(d^3)}$ and
$s_1 = (Ds)^{O(d^4)}$. □



7 Large Fan-Out Partition Tree: Proof of Theorem 1.4 



We now describe our second data structure for $\Gamma_{d,\Delta,s}$-range searching. Compared to the first data
structure from Section 5, this one works on arbitrary point sets, without the $D_0$-general position
assumption, or, alternatively, without the fuzzy boundary constraint on the output, and has slightly
better performance bounds. The data structure is built recursively, and this time the recursion
involves both $n$ and $d$.



7.1 The data structure 

Let $P$ be a set of $n$ points in $\mathbb{R}^d$, and let $\Delta$ and $s$ be parameters (not assumed to be constant). The
data structure for $\Gamma_{d,\Delta,s}$-range searching on $P$ is obtained by constructing a partition tree $\mathcal{T}$ on $P$
recursively, as above, except that now the fan-out of each node is larger (and non-constant), and
each node also stores an auxiliary data structure for handling the respective exceptional part. We
need to set two parameters: $n_0 = n_0(d, \Delta, s)$ and $r = r(d, s, \Delta, n)$. Neither of them is a constant
in general; in particular, $r$ is typically going to be a tiny power of $n$. The specific values of these
parameters will be specified later, when we analyze the query time.

We also note that there is yet another parameter in Theorem 1.4, namely, the arbitrarily small
constant $\varepsilon > 0$ entering the preprocessing time bound. However, $\varepsilon$ enters the construction solely
by the requirement that $r$ should be chosen smaller than $n^{\varepsilon/c}$, for a sufficiently large constant $c$. It
will become apparent later in the analysis that $r \le n^{\varepsilon/c}$ can be assumed, provided that some other
parameters are taken sufficiently large; we will point this out at suitable moments.

When constructing the partition tree $\mathcal{T}$ on an $n$-point set $P$, we distinguish two cases. For
$n \le n_0$, $\mathcal{T}$ consists of a single leaf storing all points of $P$. For $n > n_0$, we construct an $r$-partitioning
polynomial $f$ of degree $D = O(r^{1/d})$, the partition $\Omega$ of $\mathbb{R}^d$ induced by $f$, and the partition of $P$
into the exceptional part $P^*$ and regular parts $P_1, \ldots, P_t$, where $t = O(r)$. Set $n^* = |P^*|$ and
$n_i = |P_i|$, for $i = 1, \ldots, t$. The root of $\mathcal{T}$ stores $f$, $\Omega$, and the total weight $w(P_i)$ of each regular
part $P_i$ of $P$, as before. Still in the same way as before, we recursively preprocess each regular part
$P_i$ for $\Gamma_{d,\Delta,s}$-range searching (or stop if $|P_i| \le n_0$), and attach the resulting data structure to the
root as a respective subtree.



Handling the exceptional part. A new feature of the second data structure is that we also
preprocess the exceptional set $P^*$ into an auxiliary data structure, which is stored at the root. Here we
recurse on the dimension, exploiting the fact that $P^*$ lies on the algebraic variety $Z(f)$ of dimension
at most $d-1$.

We choose a random direction $v$ and rotate the coordinate system so that $v$ becomes the direction
of the $x_d$-axis. We construct the first-stage CAD adapted to $\{f\}$, according to Lemma 6.1 and
Theorem 4.1. We check whether all the patches are $x_d$-monotone, i.e., of type (a) in Lemma 6.1(i); if
it is not the case, we discard the CAD and repeat the construction, with a different random direction.
This yields a decomposition of $Z(f)$ into a set $\Pi$ of monotone patches, and the running time
is $D^{O(d^4)}$ with high probability.

Next, we distribute the points of $P^*$ among the patches: for each patch $\pi \in \Pi$, let $P^*_\pi$ denote the
projection of $P^* \cap \pi$ onto the coordinate hyperplane $H = \{x \in \mathbb{R}^d \mid x_d = 0\}$. We preprocess each
set $P^*_\pi$ for $\Gamma_{d-1,\Delta_1,s_1}$-range searching. Here $s_1 = (Ds)^{O(d^4)}$ is the number of polynomials defining
a range and $\Delta_1 = D^{O(d^3)}$ is their maximum degree; the constants hidden in the $O(\cdot)$ notation are the
same as in Lemma 6.3. For simplicity, we treat all patches as being $(d-1)$-dimensional (although
some may be of lower dimension); this does not influence the worst-case performance analysis.

The preprocessing of the sets $P^*_\pi$ is done recursively, using an $r_1$-partitioning polynomial in
$\mathbb{R}^{d-1}$, for a suitable value of $r_1$. The exceptional set at each node of the resulting "$(d-1)$-dimensional"
tree is handled in a similar manner, constructing an auxiliary data structure in $d-2$
dimensions, based on a first-stage CAD, and storing it at the corresponding node. The recursion on
$d$ bottoms out at dimension 1, where the structure is simply a standard binary search tree over the
resulting set of points on the $x_1$-axis. We remark that the treatment of the top level of recursion on
the dimension will be somewhat different from that of deeper levels, in terms of both the choice of
parameters and the analysis; see below for details.

This completes the description of the data structure, except for the choice of $r$ and $n_0$, which
will be provided later as we analyze the performance of the algorithm.



Answering a query. Let us assume that, for a given $P$, the data structure for $\Gamma_{d,\Delta,s}$-range searching,
as described above, has been constructed, and consider a query range $\gamma \in \Gamma_{d,\Delta,s}$. The query
is answered in the same way as before, by visiting the nodes of the partition tree $\mathcal{T}$ in a top-down
manner, except that, at each node that we visit, we also query with $\gamma$ the auxiliary data structure
constructed on the exceptional set $P^*$ for that node.

Specifically, for each patch $\pi$ of the corresponding collection $\Pi$, we compute the weight $w_\pi$
of $P^* \cap (\gamma \cap \pi)$. If $\gamma \cap \pi = \emptyset$ then $w_\pi = 0$, and if $\gamma \cap \pi = \pi$ then $w_\pi$ is the total weight of
$P^* \cap \pi$. Otherwise, i.e., if $\gamma$ crosses $\pi$, then $w_\pi$ is the same as the weight of $P^*_\pi \cap \gamma_\pi$, where $\gamma_\pi$ is
the $x_d$-projection of $\gamma \cap \pi$, because $\pi$ is $x_d$-monotone. By Lemma 6.3, $\gamma_\pi \in \Gamma_{d-1,\Delta_1,s_1}$ and can
be constructed in $(Ds)^{O(d^4)}$ time. We can therefore find the weight of $\gamma_\pi \cap P^*_\pi$ by querying the
auxiliary data structure for $P^*_\pi$ with $\gamma_\pi$. We then add $w_\pi$ to the global count maintained by the query
procedure. This completes the description of the query procedure.



7.2 Performance analysis 

The analysis of the storage requirement and preprocessing time is straightforward, and will be pro- 
vided later. We begin with the more intricate analysis of the query time. 

Let $Q_d(n, \Delta, s)$ denote the maximum overall query time for $\Gamma_{d,\Delta,s}$-range searching on a set of
$n$ points in $\mathbb{R}^d$. For $d = 1$, $Q_1(n, \Delta, s) = O(\Delta s \log n)$ because any range in $\Gamma_{1,\Delta,s}$ is the union of
at most $\Delta s$ intervals. For $n \le n_0$, $Q_d(n, \Delta, s) = O(n)$. For $d > 1$ and $n > n_0$, an analysis similar
to the one in Section 5 gives the following recurrence for $Q_d(n, \Delta, s)$:

$$Q_d(n, \Delta, s) \le C\Delta s\, r^{1-1/d}\, Q_d(n/r, \Delta, s) + \sum_{\pi \in \Pi} Q_{d-1}(n_\pi, \Delta_1, s_1) + r^c, \qquad (1)$$

where $c = d^{O(1)}$, $C$ is a constant depending on $d$, $\sum_\pi n_\pi \le n$, $D = O(r^{1/d})$, and both $|\Pi|$ and $\Delta_1 s_1$
are bounded by $(Ds)^{a_d}$ with $a_d = O(d^4)$ (these are rather crude estimates, but we prefer simplicity).
The leading term of the recurrence relies on the crossing-number bound given in Lemma 4.3. In
order to apply that lemma, we need that $r \ge \Delta^d$, which will be ensured by the choice of $r$ given
below. The second term corresponds to querying the auxiliary data structures for the exceptional set
$P^*$. The last term covers the time spent in computing the cells of the polynomial partition crossed
by the query range $\gamma$ and for computing the projections $\gamma_\pi$ for every $\pi \in \Pi$; here we assume that
the choice of $r$ will be such that $r \ge Ds$.

Ultimately, we want to derive that if $\Delta, s$ are constants, the recurrence (1) implies, with a suitable
choice of $r$ and $n_0$ at each stage,

$$Q_d(n, \Delta, s) \le n^{1-1/d} \log^{B(d,\Delta,s)} n, \qquad (2)$$

where $B(d, \Delta, s)$ is a constant depending on $d$, $\Delta$, and $s$.

However, as was already mentioned, even if $\Delta, s$ are constants initially, later in the recursion
they are chosen as tiny powers of $n$, and this makes it hard to obtain a direct inductive proof of (2).
Instead, we proceed in two stages. First, in Lemma 7.1 below we derive, without assuming $\Delta, s$ to
be constants, a weaker bound for $Q_d(n, \Delta, s)$, for which the induction is easier. Then we obtain the
stronger bound (2) for constant values of $\Delta, s$ by using the weaker bound for the $(d-1)$-dimensional
queries on the exceptional parts, i.e., for the second term in the recurrence (1).



A weaker bound for lower-dimensional queries. 

Lemma 7.1. For every $\nu > 0$ there exists $A_{d,\nu}$ such that, with a suitable choice of $r$ and $n_0$,

$$Q_d(n, \Delta, s) \le (\Delta s)^{A_{d,\nu}}\, n^{1-1/d+\nu} \qquad (3)$$

for all $d, n, \Delta, s$ (with $\Delta s \ge 2$, say).

Remarks. (i) This lemma may look very similar to our first result on $\Gamma_{d,\Delta,s}$-range searching,
Theorem 1.2, but there are two key differences: the lemma works for arbitrary point sets, with no
general position assumption, and $\Delta$ and $s$ are not assumed to be constants.

(ii) Since query time $O(n)$ is trivial to achieve, we may assume $\nu \le 1/d$, for otherwise, the bound
(3) in the lemma exceeds $n$.

Proof. The case d = 1 is trivial because Qi(n, A, s) < CAslogn clearly implies ([5]), assuming 
that Ad, v > 1 + log 2 C and that n is sufficiently large (n > 4 would do). Thus, we assume that ([3]> 
holds up to dimension d — 1 (for all v > 0, A, s, and n), and we establish it for dimension d by 
induction on n. We consider A dv yet unspecified but sufficiently large; from the proof below one 
can obtain an explicit lower bound that A^ u should satisfy. We set 

n = no(d, A, s, v) := (As) dAd >" and r = (2CAs) 1 ^. 

This value of $n_0$ is roughly the threshold where the bound (3) becomes smaller than $n$. Since we assume $\nu < 1/d$, our choice of $r$ satisfies the assumptions $r \ge \Delta^d$ and $r \ge Ds$, as needed in (1). In the inductive step, for $n \le n_0$,

\[ Q_d(n,\Delta,s) \le n \le n_0^{1/d}\, n^{1-1/d} = (\Delta s)^{A_{d,\nu}}\, n^{1-1/d} \le (\Delta s)^{A_{d,\nu}}\, n^{1-1/d+\nu}. \]

So we assume that $n > n_0$ and that the bound (3) holds for all $n' < n$. Using the induction hypothesis, i.e., plugging (3) into the recurrence (1), we obtain

\[ Q_d(n,\Delta,s) \le C\Delta s\, r^{1-1/d} (\Delta s)^{A_{d,\nu}} (n/r)^{1-1/d+\nu} + |\Pi|\,(\Delta_1 s_1)^{A_{d-1,\nu}}\, n^{1-1/(d-1)+\nu} + r^c. \qquad (4) \]

By the choice of $r$, the first term of the right-hand side of (4) can be bounded by

\[ C\Delta s\, r^{-\nu} (\Delta s)^{A_{d,\nu}}\, n^{1-1/d+\nu} = \tfrac{1}{2} (\Delta s)^{A_{d,\nu}}\, n^{1-1/d+\nu}, \]

which is half of the bound we are aiming for.

Next, we bound the second term. We use the estimates $\Delta_1 s_1 \le (Ds)^{a_d}$, $|\Pi| \le (Ds)^{a_d}$, and $Ds \le r$. Then

\[ |\Pi|\,(\Delta_1 s_1)^{A_{d-1,\nu}}\, n^{1-1/(d-1)+\nu} \le r^{a_d(A_{d-1,\nu}+1)}\, n^{1-1/(d-1)+\nu} \le \frac{r^{a_d(A_{d-1,\nu}+1)}}{n^{1/(d(d-1))}}\, n^{1-1/d+\nu}. \qquad (5) \]
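The exponent bookkeeping behind the last step of (5) is short enough to spell out. The following identity is elementary algebra (it is not stated explicitly in the text) and shows where the factor $n^{1/(d(d-1))}$ in the denominator comes from:

```latex
\[
  \frac{1}{d-1} - \frac{1}{d} = \frac{1}{d(d-1)}
  \quad\Longrightarrow\quad
  n^{1-\frac{1}{d-1}+\nu}
    = n^{-\frac{1}{d(d-1)}} \cdot n^{1-\frac{1}{d}+\nu}.
\]
```

For instance, for $d = 3$ this reads $n^{1/2+\nu} = n^{-1/6}\cdot n^{2/3+\nu}$.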

We choose

\[ A_{d,\nu} \ge \frac{a'}{\nu}\, d\, a_d\, (A_{d-1,\nu}+1), \qquad (6) \]

where $a'$ is a suitable constant, i.e., we choose $A_{d,\nu} = d^{O(d)}/\nu^d$. Since $n > n_0 = (\Delta s)^{dA_{d,\nu}}$ and $r = (2C\Delta s)^{1/\nu}$, the fraction in (5) can be bounded by

\[ \frac{r^{a_d(A_{d-1,\nu}+1)}}{n^{1/(d(d-1))}} \le \frac{(2C\Delta s)^{A_{d,\nu}/(a'(d-1))}}{(\Delta s)^{A_{d,\nu}/(d-1)}} = \left(\frac{2C}{(\Delta s)^{a'-1}}\right)^{A_{d,\nu}/(a'(d-1))} \le 1, \]

provided $a' \ge \log_2(4C)$ (recall that $\Delta s \ge 2$). The whole expression in (5) is thus negligible compared with the desired bound $(\Delta s)^{A_{d,\nu}}\, n^{1-1/d+\nu}$.
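As a quick numerical sanity check of the proviso $a' \ge \log_2(4C)$, the sketch below evaluates the base $2C/(\Delta s)^{a'-1}$ of the final power in the chain above. The function name `fraction_base` and the sample values of $C$, $\Delta$, $s$ are illustrative assumptions, not values from the paper; the only facts used are $\Delta s \ge 2$ and the stated lower bound on $a'$.

```python
import math

def fraction_base(C, Delta, s, a_prime):
    """Base 2C / (Delta*s)**(a' - 1) of the last power bounding the fraction in (5)."""
    return 2.0 * C / (Delta * s) ** (a_prime - 1.0)

# Hypothetical constant C, chosen only for illustration.
C = 8.0
a_prime = math.log2(4.0 * C)   # smallest a' allowed by the proviso

# With Delta*s >= 2 the base never exceeds 1, so raising it to a
# positive power keeps the fraction in (5) at most 1.
for Delta, s in [(1, 2), (2, 3), (5, 7)]:
    assert fraction_base(C, Delta, s, a_prime) <= 1.0 + 1e-12
```

Choosing $a'$ below the threshold (for example $a' = 4$ with $C = 8$ and $\Delta s = 2$) makes the base exceed 1, which is why the proviso is needed.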

Finally, recall that $c = d^{O(1)}$, so our choice of $A_{d,\nu}$ (again, choosing $a'$ sufficiently large) ensures that $r^c \le n^{1-1/d}$. Hence, the right-hand side in (4) is bounded by $(\Delta s)^{A_{d,\nu}}\, n^{1-1/d+\nu}$, as desired. This establishes the induction step and thereby completes the proof of the lemma. □



The improved bound for the query time. Now we want to obtain the improved bound (2), i.e.,

\[ Q_d(n,\Delta,s) \le n^{1-1/d}\log^{B} n, \quad\text{with } B = B(d,\Delta,s), \]

assuming that $\Delta, s$ are constants and $n \ge 2$. To this end, in the top-level ($d$-dimensional) partition tree, we set $r := n^{\delta}$, where $\delta > 0$ is a suitable small constant to be specified later. Then we use the result of Lemma 7.1 with $\nu := \frac{1}{2d(d-1)}$ for processing the $(d-1)$-dimensional queries on the sets $P^*_\pi$. Thus, in the forthcoming proof, we do induction only on $n$, while $d$ is fixed throughout.

We choose $n_0 = n_0(d,\Delta,s)$ sufficiently large, and we assume that $n > n_0$ and that the desired bound (2) holds for all $n' < n$. In the inductive step we estimate, using the recurrence (1), the induction hypothesis, and the bound in (3),

\[ Q_d(n,\Delta,s) \le C\Delta s\, r^{1-1/d} (n/r)^{1-1/d} \log^{B}(n/r) + |\Pi|\,(\Delta_1 s_1)^{A_{d-1,\nu}}\, n^{1-1/(d-1)+\nu} + r^c. \]

The first term simplifies to $(1-\delta)^{B} C\Delta s\, n^{1-1/d}\log^{B} n$. Thus, if we choose $B$ depending on $\delta$ (which is a small positive constant still to be determined) so that $(1-\delta)^{B} C\Delta s \le \frac{1}{2}$, then the first term is going to be at most half of the target value $n^{1-1/d}\log^{B} n$. Thus, it suffices to set $\delta$ so that the remaining two terms are negligible compared to this value.
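The condition $(1-\delta)^{B} C\Delta s \le \frac12$ determines how large the exponent $B$ must be once $\delta$ is fixed. The following minimal sketch computes the smallest integer $B$ satisfying it; the function name `min_exponent_B` and the numerical values are hypothetical, since the paper does not give explicit constants.

```python
def min_exponent_B(C, Delta, s, delta):
    """Smallest integer B >= 0 with (1 - delta)**B * C * Delta * s <= 1/2."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    B = 0
    # Each increment of B multiplies the left-hand side by (1 - delta) < 1,
    # so the loop terminates.
    while (1.0 - delta) ** B * C * Delta * s > 0.5:
        B += 1
    return B

# Illustrative (made-up) constants: C = 1, Delta = s = 2.
print(min_exponent_B(1.0, 2, 2, 0.10))   # 20
print(min_exponent_B(1.0, 2, 2, 0.05))   # 41
```

In this toy instance, halving $\delta$ roughly doubles $B$: a smaller $\delta$ forces a larger exponent in the polylogarithmic factor.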

For the $r^c$ term, any $\delta < 1/(2c)$ will do. The second term can be bounded, as in the proof of Lemma 7.1, by $r^{a_d(A_{d-1,\nu}+1)}\, n^{1-1/(d-1)+\nu}$. We substitute $\nu = \frac{1}{2d(d-1)}$ and rewrite the resulting expression as

\[ \frac{r^{a_d(A_{d-1,\nu}+1)}}{n^{1/(2d(d-1))}}\; n^{1-1/d}. \]

Thus, with $\delta \le \bigl(a_d(A_{d-1,\nu}+1)\,2d(d-1)\bigr)^{-1}$, the term is at most $n^{1-1/d}$. Again, this establishes the induction step and concludes the proof of the final bound for the query time.
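With $r = n^{\delta}$, both thresholds on $\delta$ can be read off from exponents alone. The display below is a worked restatement of the step just made, under the text's choice $\nu = 1/(2d(d-1))$ and assuming $n > 1$ and $d \ge 2$:

```latex
\[
  \frac{r^{a_d(A_{d-1,\nu}+1)}}{n^{1/(2d(d-1))}}
    = n^{\,\delta\, a_d(A_{d-1,\nu}+1) \,-\, \frac{1}{2d(d-1)}} \le 1
  \iff
  \delta \le \frac{1}{a_d(A_{d-1,\nu}+1)\,2d(d-1)},
\]
\[
  r^{c} = n^{c\delta} < n^{1/2} \le n^{1-1/d}
  \quad\text{whenever } \delta < \frac{1}{2c}.
\]
```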

Analysis of storage and preprocessing. Let $S_d(n,\Delta,s)$ denote the size of the data structure on $n$ points in $\mathbb{R}^d$ for $\Gamma_{d,\Delta,s}$-range searching, with the settings of $r$ and $n_0$ as described above. For $n \le n_0 = n_0(d,\Delta,s)$ we have $S_d(n,\Delta,s) = O(n)$. For larger values of $n$, the space occupied by the root of the partition tree, not counting the auxiliary data structure for the exceptional part $P^*$, is bounded by $r^c$, where $c = d^{O(1)}$. Furthermore, since $S_d(n,\Delta,s)$ is at least linear in $n$, the total size of the auxiliary data structure constructed on $P^*$ is $\sum_{\pi\in\Pi} S_{d-1}(n_\pi,\Delta_1,s_1) \le S_{d-1}(n^*,\Delta_1,s_1)$, where $n^* = |P^*|$. We thus obtain the following recurrence for $S_d(n,\Delta,s)$:

\[ S_d(n,\Delta,s) \le \sum_{i=1}^{t} S_d(n_i,\Delta,s) + S_{d-1}(n^*,\Delta_1,s_1) + O(r^c) \]

for $n > n_0 = n_0(d,\Delta,s)$, and $S_d(n,\Delta,s) = O(n)$ for $n \le n_0$. Using $n_i \le n/r$, $n^* + \sum_i n_i \le n$, and $r^c = o(n)$, for both types of choices of $r$, the recurrence easily leads to

\[ S_d(n,\Delta,s) = O(n), \]

where the constant of proportionality depends on $d$.
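To get a feel for why the additive $O(r^c)$ terms do not spoil linearity, one can iterate a simplified version of the recurrence numerically. The sketch below is a toy model under explicit assumptions that are not part of the paper's analysis: there is no exceptional part ($n^* = 0$), every node has exactly $r = n^{\delta}$ children of size $n/r$, the hidden constant is 1, and the values of `delta`, `c`, and `n0` are made up.

```python
def storage_per_point(n, delta=0.25, c=2.0, n0=16.0):
    """Return S(n)/n for the toy recurrence
        S(n) = r * S(n/r) + r**c,  r = n**delta,  S(n) = n for n <= n0.
    Dividing by n gives g(n) = g(n**(1 - delta)) + n**(c*delta - 1),
    which is what the loop accumulates.
    """
    if c * delta >= 1.0:
        raise ValueError("need r**c = o(n), i.e. c*delta < 1")
    ratio = 1.0          # base case: S(n) = n
    m = float(n)
    while m > n0:
        ratio += m ** (c * delta - 1.0)   # cost r**c of this node, per point
        m = m ** (1.0 - delta)            # size n/r of each child
    return ratio

# The per-point storage stays bounded as n grows by many orders of magnitude.
for n in (1e6, 1e12, 1e24):
    assert 1.0 <= storage_per_point(n) < 2.0
```

The contributions $n^{c\delta-1}$ are largest near the leaves and tiny near the root, so their sum along a root-to-leaf path stays bounded in this model, in line with the $O(n)$ bound.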

It remains to estimate the preprocessing time; here, finally, the parameter $\varepsilon > 0$ in Theorem 1.4 comes into play. Let $\delta^*$ be a constant such that $r \le n^{\delta^*}$ (at all stages of the algorithm). As was remarked in the preceding analysis of the query time, we can make $\delta^*$ arbitrarily small by adjusting various constants (and, generally speaking, the smaller $\delta^*$, the worse the constant $B(d,\Delta,s)$ we obtain in the query time bound).

Let $T_d(n,\Delta,s,\delta^*)$ denote the maximum preprocessing time of our data structure for $\Gamma_{d,\Delta,s}$-range searching on $n$ points, with $\delta^* > 0$ a constant as above. Using the operation (A1) of Section 4, we spend $O(nr^c)$ time to compute $\Omega(f)$ and the partition of $P$ into the exceptional part and the regular parts. We spend additional $O(nr^c)$ time to compute $\Pi$ and $P^*_\pi$ for every $\pi \in \Pi$, where $c = d^{O(1)}$. The total time spent in constructing the secondary data structures for all patches of $\Pi$ is bounded by $T_{d-1}(n^*,\Delta_1,s_1,\delta^*)$. Hence, we obtain the recurrence

\[ T_d(n,\Delta,s,\delta^*) \le \sum_{i=1}^{t} T_d(n_i,\Delta,s,\delta^*) + T_{d-1}(n^*,\Delta_1,s_1,\delta^*) + O(nr^c) \]

for $n > n_0$, and $T_d(n,\Delta,s,\delta^*) = O(n)$ for $n \le n_0$. Using the properties $n_i \le n/r$ and $n^* + \sum_i n_i = n$, a straightforward calculation shows that

\[ T_d(n,\Delta,s,\delta^*) = O\bigl(n^{1+c\delta^*}\bigr), \]

where the constant of proportionality depends on $d$. Hence, by choosing $\delta^* = \varepsilon/c$, the preprocessing time is $O(n^{1+\varepsilon})$. This concludes the proof of Theorem 1.4. □
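One way to see why the preprocessing recurrence yields $O(n^{1+c\delta^*})$ is to unroll it in a simplified setting. The sketch below assumes, for illustration only, that there is no exceptional part ($n^* = 0$), that every node has exactly $r$ regular parts of size $n/r$, and that $r = n^{\delta^*}$ at every node; the constant $K$ and the sizes $m_j$ are notation introduced here. Write $t(n) := T_d(n,\Delta,s,\delta^*)/n$, and let $m_0 = n$, $m_{j+1} = m_j^{1-\delta^*}$ be the subproblem sizes along a root-to-leaf path.

```latex
\begin{align*}
  t(m_j) &\le t(m_{j+1}) + K\, m_j^{c\delta^*}, \\
  \frac{m_{j+1}^{c\delta^*}}{m_j^{c\delta^*}}
      &= m_j^{-c(\delta^*)^2} \le n_0^{-c(\delta^*)^2} =: \rho < 1
      \qquad (m_j > n_0), \\
  t(n) &\le O(1) + K\, n^{c\delta^*} \sum_{j \ge 0} \rho^{\,j}
        = O\bigl(n^{c\delta^*}\bigr),
  \qquad\text{so}\qquad
  T_d(n,\Delta,s,\delta^*) = O\bigl(n^{1+c\delta^*}\bigr).
\end{align*}
```

In this model the per-level costs decay geometrically from the root downward, so the root's $O(nr^c)$ term dominates.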



8 Open Problems 

We conclude this paper by mentioning a few open problems. 

(i) A very interesting and challenging problem is, in our opinion, the fast-query case of range searching with constant-complexity semialgebraic sets, where the goal is to answer a query in $O(\log n)$ time using roughly $n^d$ space. There are actually two, apparently distinct, issues. The standard approach to fast-query searching is to parameterize the ranges in $\Gamma$ by points in a space of a suitable dimension, say $t$; then the $n$ points of $P$ correspond to $n$ algebraic surfaces in this $t$-dimensional "parameter space", and a query is answered by locating the point corresponding to the query range in the arrangement of these surfaces.

First, the arrangement has $O(n^t)$ combinatorial complexity, and one would expect to be able to locate points in it in polylogarithmic time with storage about $n^t$. However, such a method is known only up to dimension 4, and in higher dimensions one again gets stuck at the arrangement decomposition problem, which was the bottleneck in the previously known solution for the low-storage variant, as was mentioned in the Introduction. It would be nice to use polynomial partitions to obtain a better point location data structure for such arrangements, but unfortunately, so far all of our attempts in this direction have failed.

The second issue is whether the point location approach just sketched is actually optimal. This question arises already in the simple instance of range searching with disks in the plane. The best known solution that guarantees logarithmic query time uses point location in $\mathbb{R}^3$ and requires storage roughly $n^3$, but it is conceivable that roughly quadratic storage might suffice.
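For the disk example, the parameterization can be made concrete. The sketch below assumes the natural center-and-radius parameterization $(a, b, \rho)$ of a disk, which the text does not fix explicitly; the function names are hypothetical. Each data point $p = (x, y)$ defines the surface $g_p(a,b,\rho) = (x-a)^2 + (y-b)^2 - \rho^2 = 0$ in the 3-dimensional parameter space, and $p$ lies in the query disk exactly when $g_p$ is nonpositive at the query's parameter point.

```python
def surface_sign(point, disk):
    """Value of g_p at the parameter point of the disk.

    A disk is the parameter point (a, b, rho) with rho >= 0; the data point
    p = (x, y) defines g_p(a, b, rho) = (x-a)**2 + (y-b)**2 - rho**2.
    p lies in the closed disk exactly when the returned value is <= 0.
    """
    x, y = point
    a, b, rho = disk
    return (x - a) ** 2 + (y - b) ** 2 - rho ** 2

def count_in_disk(points, disk):
    """Brute-force O(n) range count, used here only as a reference;
    point location in the arrangement of the n surfaces is the
    fast-query alternative discussed in the text."""
    return sum(1 for p in points if surface_sign(p, disk) <= 0)

points = [(0, 0), (1, 0), (3, 4), (0, 2)]
print(count_in_disk(points, (0, 0, 2)))   # 3: (0,0), (1,0), and (0,2) on the boundary
print(count_in_disk(points, (3, 4, 1)))   # 1: only (3,4)
```

The answer to a query depends only on the signs of the $n$ functions $g_p$ at one point of $\mathbb{R}^3$, which is why the problem reduces to point location in an arrangement of $n$ surfaces.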



(ii) Our range-searching data structure for arbitrary point sets (the one with large fan-out) is so complex, and has a rather high exponent in the polylogarithmic factor, because we have difficulty handling highly degenerate point sets, where many points lie on low-degree algebraic surfaces.
This issue appears even more strongly in combinatorial applications, and in that setting it has been 
dealt with only in rather specific cases (e.g., in dimension 3); see p3]29] 34} for initial studies. It would be nice to find a construction of suitable "multilevel polynomial partitions" that would cater to such highly degenerate input sets, as touched upon in p3j[34|.



(iii) Another open problem, related to the construction of polynomial partitions, is the fast evaluation of a multivariate polynomial at many points, as briefly discussed at the end of Section 3.